Query takes longer than expected: possible optimizations at the query level

I have an Elasticsearch 8.6.0 instance with some data in it. I have been querying it with a query_string query containing 130 keywords, in the following format:

{
    "query": {
        "bool": {
            "should": [
                {
                    "bool": {
                        "must": [
                            {
                                "query_string": {
                                    "fields": [
                                        "field1.subfield1"
                                    ],
                                    "query": '("project manager" | "project management" | "product management" | ....| "product manager")'
                                }
                            }
                        ],
                        "must_not": [
                            {
                                "nested": {
                                    "path": "field2",
                                    "query": {
                                        "query_string": {
                                            "fields": [
                                                "field2.subfield1",
                                                "field2.subfield2",
                                                "field2.subfield3",
                                                "field2.subfield4",
                                                "field2.subfield5"
                                            ],
                                            "query": '("project manager" | "project management" | "product management" | ....| "product manager")'
                                        }
                                    }
                                }
                            }
                        ]
                    }
                }
            ],
            "filter": [],
            "must_not": []
        }
    },
    "size": 100
}

However, the query takes 140-150 ms to finish, which is above my 100 ms target.

The index holds 560 documents of around 550 kilobytes each. It has 1 shard and 1 replica.

The cluster has a 4 GiB JVM heap, 8 GiB of memory, and 4 allocated processors, and runs in a Linux environment (Ubuntu 20.04.5 LTS).

The mapping consists mostly of text fields and a few vector fields. There is one nested field with various text subfields and a few numeric subfields (long and float).

The expectation is that the query should finish in less than 100 ms. What could be causing these times, and are there any suggestions for improving the response time of such queries?

Hi @nadeem.akhter

This answer is old, but it may help you. I see that you work with keywords, but your query does not use any filter. As that answer says, it is hard to give general performance advice, but following some of its tips may help you reach your goal.
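
A minimal sketch of what the filter suggestion could look like, assuming the keyword clause that needs scoring stays in `must` while clauses that only include or exclude documents go in `filter`. Clauses under `filter` (and `must_not`) run in filter context, so they do not contribute to `_score` and can be cached. The helper name `build_query` and the `status` field are hypothetical and not part of the original mapping.

```python
import json


def build_query(phrases, scored_field, filter_clauses=None, size=100):
    """Build a bool query: scored phrase matching in `must`, unscored clauses in `filter`."""
    # Quote each phrase and join with " | ", mirroring the format used in the question.
    query_text = "(" + " | ".join(f'"{p}"' for p in phrases) + ")"
    return {
        "query": {
            "bool": {
                # `must` runs in query context: it contributes to _score.
                "must": [
                    {"query_string": {"fields": [scored_field], "query": query_text}}
                ],
                # `filter` runs in filter context: no scoring, results may be cached.
                "filter": list(filter_clauses or []),
            }
        },
        "size": size,
    }


body = build_query(
    ["project manager", "product manager"],
    "field1.subfield1",
    # Hypothetical filter; "status" is not a field from the original mapping.
    filter_clauses=[{"term": {"status": "active"}}],
)
print(json.dumps(body, indent=2))
```

Whether this helps depends on the larger query: only clauses that do not need to influence the score are candidates for `filter`.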

Thank you for your answer, @RabBit_BR.

Filtering does not work here because this query is part of a larger query in which we need certain scoring functionality.
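
If the scored clauses have to stay, one way to see where the 140-150 ms goes is the search Profile API: adding `"profile": true` to the request body makes Elasticsearch return per-shard timings alongside the hits. The sketch below only builds the request body and summarizes the response. The helper names `with_profile` and `summarize_profile` are made up for illustration, and the code assumes the documented response layout (`profile.shards[].searches[].query[].time_in_nanos` and, where present, `profile.shards[].fetch.time_in_nanos`).

```python
def with_profile(body):
    """Return a copy of a search request body with profiling enabled."""
    return {**body, "profile": True}


def summarize_profile(response):
    """Return per-shard query and fetch time in milliseconds from a profiled search response."""
    summary = []
    for shard in response.get("profile", {}).get("shards", []):
        # Sum the top-level query nodes; each node's time already includes its children.
        query_nanos = sum(
            node.get("time_in_nanos", 0)
            for search in shard.get("searches", [])
            for node in search.get("query", [])
        )
        # The fetch section may be absent, so default to 0.
        fetch_nanos = shard.get("fetch", {}).get("time_in_nanos", 0)
        summary.append(
            {
                "shard": shard.get("id"),
                "query_ms": query_nanos / 1_000_000,
                "fetch_ms": fetch_nanos / 1_000_000,
            }
        )
    return summary
```

With documents of around 550 kilobytes and `"size": 100`, a `fetch_ms` that is large relative to `query_ms` would suggest that loading the hits, rather than matching the keywords, accounts for much of the time.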
