Query taking longer times than expected, possible ways of optimization at query level

nadeem.akhter · May 24, 2023, 12:37pm

I have an Elasticsearch 8.6.0 instance with some data in it. I have been querying it with a query_string query containing 130 keywords, in the following format:

{
    "query": {
        "bool": {
            "should": [
                {
                    "bool": {
                        "must": [
                            {
                                "query_string": {
                                    "fields": [
                                        "field1.subfield1"
                                    ],
                                    "query": "(\"project manager\" | \"project management\" | \"product management\" | ....| \"product manager\")"
                                }
                            }
                        ],
                        "must_not": [
                            {
                                "nested": {
                                    "path": "field2",
                                    "query": {
                                        "query_string": {
                                            "fields": [
                                                "field2.subfield1",
                                                "field2.subfield2",
                                                "field2.subfield3",
                                                "field2.subfield4",
                                                "field2.subfield5"
                                            ],
                                            "query": "(\"project manager\" | \"project management\" | \"product management\" | ....| \"product manager\")"
                                        }
                                    }
                                }
                            }
                        ]
                    }
                }
            ],
            "filter": [],
            "must_not": []
        }
    },
    "size": 100
}

However, the query takes 140-150 ms to finish, which is above the 100 ms I am aiming for.

I have been using Elasticsearch 8.6.0.

The number of documents in the index is 560 and each document is around 550 kilobytes. It has 1 shard and 1 replica.
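With a size of 100 and documents of around 550 KB, a single response could carry up to roughly 55 MB of `_source`, so part of the query time may be spent fetching and serializing hits rather than matching them. That is an assumption worth checking, not something measured in this thread. A minimal Python sketch that estimates the payload and builds a variant of a search body returning only selected fields; `estimated_payload_mb` and `with_source_filter` are hypothetical helper names, and the query shown is a placeholder:

```python
import copy

def estimated_payload_mb(size, doc_kb):
    """Rough upper bound on the returned _source, in decimal megabytes."""
    return size * doc_kb / 1000

def with_source_filter(body, fields=None):
    """Return a copy of a search body that limits the returned _source.

    With no fields, _source is disabled entirely; otherwise only the
    listed fields are returned. The original body is left untouched.
    """
    slim = copy.deepcopy(body)
    slim["_source"] = list(fields) if fields else False
    return slim

body = {"query": {"match_all": {}}, "size": 100}  # placeholder query
print(estimated_payload_mb(100, 550))
print(with_source_filter(body, ["field1.subfield1"]))
```

Comparing the timing of the original request against the slimmed one would show whether the fetch phase matters here at all.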

The JVM heap size of the cluster is 4 GiB and the memory of the cluster is 8 GiB with 4 allocated processors, in a Linux environment (Ubuntu 20.04.5 LTS).

The mapping consists mostly of text fields and a few vector fields. There is one nested field with various text subfields, plus a few numeric fields (long and float).

The expectation is that the query finishes in under 100 ms. What could be causing these times, and are there any suggestions for improving query times for queries like this?
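One way to see where the time goes is the search Profile API: adding `"profile": true` to the request body makes Elasticsearch return timings for each query component. A minimal Python sketch, assuming the response has already been parsed into a dict; `slowest_components` is a hypothetical helper, and the sample response is made up to show the shape, not measured data:

```python
def walk(node):
    """Yield every query node in a profile tree, parents before children."""
    yield node
    for child in node.get("children", []):
        yield from walk(child)

def slowest_components(response, top=5):
    """Return (time_ms, type, description) for the slowest profiled query nodes."""
    rows = []
    for shard in response["profile"]["shards"]:
        for search in shard["searches"]:
            for root in search["query"]:
                for node in walk(root):
                    rows.append((node["time_in_nanos"] / 1e6,
                                 node["type"], node["description"]))
    return sorted(rows, reverse=True)[:top]

# Made-up fragment with the shape of a profiled search response.
sample = {"profile": {"shards": [{"searches": [{"query": [{
    "type": "BooleanQuery", "description": "outer", "time_in_nanos": 9_000_000,
    "children": [
        {"type": "PhraseQuery", "description": "phrase", "time_in_nanos": 6_000_000},
        {"type": "TermQuery", "description": "term", "time_in_nanos": 1_000_000},
    ]}]}]}]}}

for row in slowest_components(sample):
    print(row)
```

A parent node's time includes its children, so the useful comparison is between siblings. Profiling adds overhead of its own, so the numbers are best read as relative rather than absolute.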

RabBit_BR · May 24, 2023, 4:25pm

Hi @nadeem.akhter

This answer is old, but it may help you. I see that you work with keywords, but your query does not use any filter. As the answer below says, it is hard to give performance advice in general, but following some of the tips there may help you reach your goal.

nadeem.akhter · June 1, 2023, 4:14am

Thank you for your answer, @RabBit_BR.

Filtering does not work for us because this query is a subset of a larger query where we need certain scoring functionality.
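Scoring and filtering can coexist in one `bool` query: clauses under `must` and `should` contribute to the score, while clauses under `filter` and `must_not` run in filter context, where scoring is skipped and the clauses are considered for caching. A minimal Python sketch of that split; `build_bool` and the `status` field are hypothetical and not part of the original query:

```python
def build_bool(scoring, non_scoring=(), excluded=()):
    """Build a bool query that keeps scoring clauses apart from yes/no ones.

    scoring      -> "must": matched and scored
    non_scoring  -> "filter": matched but not scored
    excluded     -> "must_not": also evaluated in filter context
    """
    bool_query = {"must": list(scoring)}
    if non_scoring:
        bool_query["filter"] = list(non_scoring)
    if excluded:
        bool_query["must_not"] = list(excluded)
    return {"bool": bool_query}

query = build_bool(
    scoring=[{"query_string": {"fields": ["field1.subfield1"],
                               "query": '"project manager"'}}],
    non_scoring=[{"term": {"status": "active"}}],  # hypothetical field
)
print(query)
```

Only constraints that truly do not need to influence ranking belong in `filter`; the keyword match that drives the score stays in `must`.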
