I have an Elasticsearch 8.6.0 instance with some data in it. I have been querying it with a query_string query containing 130 keywords, in the following format:
{
    "query": {
        "bool": {
            "should": [
                {
                    "bool": {
                        "must": [
                            {
                                "query_string": {
                                    "fields": [
                                        "field1.subfield1"
                                    ],
                                    "query": '("project manager" | "project management" | "product management" | ....| "product manager")'
                                }
                            }
                        ],
                        "must_not": [
                            {
                                "nested": {
                                    "path": "field2",
                                    "query": {
                                        "query_string": {
                                            "fields": [
                                                "field2.subfield1",
                                                "field2.subfield2",
                                                "field2.subfield3",
                                                "field2.subfield4",
                                                "field2.subfield5",
                                            ],
                                            "query": '("project manager" | "project management" | "product management" | ....| "product manager")'
                                        }
                                    },
                                }
                            }
                        ],
                    }
                },
            ],
            "filter": [],
            "must_not": [],
        }
    },
    "size": 100
}
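For reference, here is a minimal sketch of how a body in this format could be assembled from the keyword list in Python. The function name `build_query` and its parameters are hypothetical (they are not part of my actual code or of any Elasticsearch client); the field names are the placeholders used above.

```python
def build_query(keywords, size=100):
    """Build a request body in the format shown above (illustrative helper only)."""
    # Quote every phrase and join with " | ", mirroring the query text in the question.
    query_text = "(" + " | ".join(f'"{kw}"' for kw in keywords) + ")"
    # Placeholder nested subfields: field2.subfield1 .. field2.subfield5.
    nested_fields = [f"field2.subfield{i}" for i in range(1, 6)]
    return {
        "query": {
            "bool": {
                "should": [
                    {
                        "bool": {
                            "must": [
                                {
                                    "query_string": {
                                        "fields": ["field1.subfield1"],
                                        "query": query_text,
                                    }
                                }
                            ],
                            "must_not": [
                                {
                                    "nested": {
                                        "path": "field2",
                                        "query": {
                                            "query_string": {
                                                "fields": nested_fields,
                                                "query": query_text,
                                            }
                                        },
                                    }
                                }
                            ],
                        }
                    }
                ],
                "filter": [],
                "must_not": [],
            }
        },
        "size": size,
    }
```

Calling `build_query(["project manager", "product manager"])` yields the same structure with a two-phrase query text; the real request passes all 130 keywords.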
However, the query takes 140-150 ms to finish, which is above my 100 ms target.
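To see where that time goes, the same request can be sent with `"profile": true` added to the body, which makes Elasticsearch include per-clause timings in the response. Below is a minimal sketch of flattening that section into rows, assuming the documented response layout (`profile.shards[].searches[].query[]` entries with `type`, `time_in_nanos` and optional `children`). The helper name `flatten_profile` is hypothetical, and I have not included real output here.

```python
def flatten_profile(response):
    """Return (shard_id, depth, query_type, millis) rows from a profiled search response."""
    rows = []

    def walk(shard_id, node, depth):
        # time_in_nanos is reported in nanoseconds; convert to milliseconds.
        rows.append((shard_id, depth, node["type"], node["time_in_nanos"] / 1e6))
        for child in node.get("children", []):
            walk(shard_id, child, depth + 1)

    # A response without a "profile" section simply yields no rows.
    for shard in response.get("profile", {}).get("shards", []):
        for search in shard.get("searches", []):
            for node in search.get("query", []):
                walk(shard["id"], node, 0)
    return rows
```

Sorting the rows by the millisecond column should show whether the `must` clause or the nested `must_not` clause dominates. The top-level `took` value in the response (in milliseconds) can also be compared with the client-side wall-clock time to separate search time from network and serialization overhead.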
The index holds 560 documents of around 550 kilobytes each. It has 1 shard and 1 replica.
The cluster has 8 GiB of memory, a 4 GiB JVM heap, and 4 allocated processors, and runs on Linux (Ubuntu 20.04.5 LTS).
The mapping consists mostly of text fields and a few vector fields. There is one nested data type with various text subfields, and a few numeric types (long and float).
I expect the query to finish in under 100 ms. What could be causing these times, and how can I improve the latency of queries like this?