My search query is really slow

I have already indexed millions of documents. Each document has two fields, id and words, and both are mapped as text. The words field contains about 900 numeric words separated by whitespace, for example:

    12 13 15 234 5454 346 3477 ... 935383 1283129 (about 900 words)

I believe these words are split into separate terms when the field is indexed.
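
As a quick check, the _analyze API can confirm which tokens the analyzer actually produces for this field (a sketch, assuming an index named my-index and the default standard analyzer):

    POST /my-index/_analyze
    {
        "field": "words",
        "text": "12 13 15 234 5454 346 3477"
    }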

Index mapping:

    id : text
    words : text
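
In mapping JSON, that corresponds to something like this (a sketch; index settings omitted):

    {
        "mappings": {
            "properties": {
                "id": { "type": "text" },
                "words": { "type": "text" }
            }
        }
    }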

I want to search with another set of words, which also contains about 900 numeric words, so I created the search query below.

My search query:

    {
        "size": 1,
        "query": {
            "function_score": {
                "query": {
                    "match_all": {
                        "boost": 1
                    }
                },
                "functions": [
                    {
                        "filter": {
                            "match": {
                                "words": {
                                    "query": "17932",
                                    "operator": "OR",
                                    "prefix_length": 0,
                                    "max_expansions": 50,
                                    "fuzzy_transpositions": true,
                                    "lenient": false,
                                    "zero_terms_query": "NONE",
                                    "auto_generate_synonyms_phrase_query": true,
                                    "boost": 1
                                }
                            }
                        },
                        "weight": 1
                    },
                    ... ***there are 900 filters*** ...
                    {
                        "filter": {
                            "match": {
                                "words": {
                                    "query": "16516932",
                                    "operator": "OR",
                                    "prefix_length": 0,
                                    "max_expansions": 50,
                                    "fuzzy_transpositions": true,
                                    "lenient": false,
                                    "zero_terms_query": "NONE",
                                    "auto_generate_synonyms_phrase_query": true,
                                    "boost": 1
                                }
                            }
                        },
                        "weight": 1
                    }
                ],
                "score_mode": "sum",
                "max_boost": 3.4028235e+38,
                "min_score": 100,
                "boost": 1
            }
        },
        "_source": {
            "includes": [
                "id"
            ],
            "excludes": []
        },
        "sort": [
            {
                "_score": {
                    "order": "desc"
                }
            }
        ]
    }

It was working fine when I had fewer documents indexed in Elasticsearch, but now the query has started timing out.

I have tried several things to optimize the cluster:

  1. I have enough nodes, shards, and replicas.
  2. Indexing is turned off, so CPU utilization is low.
  3. I changed to an instance type with more CPUs.

I think the last thing I can try is to fix the search query itself to speed up its execution. Does anyone have any ideas?

Can you describe the expected behaviour of your query? Are you looking for documents that contain all 900 queried words? If so, have you tried concatenating the words into a single space-separated string and using it in one match query with minimum_should_match set to 100%?
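
For example, something along these lines (a sketch; the ... stands in for the remaining words, reusing the values from your query):

    {
        "size": 1,
        "query": {
            "match": {
                "words": {
                    "query": "17932 ... 16516932",
                    "operator": "OR",
                    "minimum_should_match": "100%"
                }
            }
        }
    }

With operator OR and minimum_should_match at 100%, every queried term still has to match, but Elasticsearch evaluates it as one query instead of 900 separate filters.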

Would a terms query be simpler?

"terms": {"words": [1,2]}

This worked for me on a text field.
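
As a full request, that might look like this (a sketch, reusing values from the question; note that a terms query matches documents containing any of the listed values and gives them all the same constant score, so it does not reproduce the min_score counting behaviour of the original function_score query):

    {
        "size": 1,
        "query": {
            "terms": {
                "words": ["17932", "16516932"]
            }
        },
        "_source": ["id"]
    }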
