First-time query with different params takes a lot of time

Hi,

I have around 20 lakh (2 million) records in a single index. The total document size comes to around 6.2 GB.
Whenever a new search is performed, the query takes a lot of time to execute. If I run the same query again, it executes very quickly. I understand that caching is coming into play here.

Is there a way I can reduce the time taken the first time a query is executed?

Note: I use a single node with 2 replicas and 5 shards.

Which version of Elasticsearch are you using, and what do your query and mapping look like?

I am using version 6.2.4.

I am using a UI library called ReactiveSearch to connect to Elasticsearch. These are the requests it sends:

{"query":{"bool":{"must":[{"bool":{"must":[{"bool":{"must":{"query_string":{"query":"test data","fields":["title^3000","description^50","title.analysed^100","description.analysed^20","date","metadata.createdBy","metadata.author","metadata.products","metadata.country","metadta.region","metadata.spaceKey^0.1","metadata.path","metadata.tags^0.1","metadata.channel^0.1","metadata.key"],"type":"best_fields","default_operator":"and","fuzziness":1,"phrase_slop":2}},"should":[{"range":{"date_original":{"boost":100,"gte":"now-1d/d"}}},{"range":{"date_original":{"boost":90,"gte":"now-7d/d"}}},{"range":{"date_original":{"boost":80,"gte":"now-14d/d"}}},{"range":{"date_original":{"boost":70,"gte":"now-1M/M"}}},{"range":{"date_original":{"boost":50,"gte":"now-3M/M"}}},{"range":{"date_original":{"boost":30,"gte":"now-6M/M"}}},{"range":{"date_original":{"boost":10,"gte":"now-12M/M"}}}],"filter":{"terms":{"docType.keyword":["Box","Confluence","IntegrationDirectory","DevPortal","MTSFaq","Postman","ProductCentral","StackOverflow","Tools","CircleHD","GitHub","Jira","CartRef"]}}}}]}}]}},"highlight":{"force_source":true,"pre_tags":["<mark>"],"post_tags":["</mark>"],"fields":{"description.analysed":{},"description":{},"title.analysed":{},"title":{}},"number_of_fragments":0},"size":0,"aggs":{"docType.keyword":{"terms":{"field":"docType.keyword","size":100,"order":{"_term":"asc"}}}}}
{"preference":"Products"}
{"query":{"bool":{"must":[{"bool":{"must":[{"bool":{"must":{"query_string":{"query":"test data","fields":["title^3000","description^50","title.analysed^100","description.analysed^20","date","metadata.createdBy","metadata.author","metadata.products","metadata.country","metadta.region","metadata.spaceKey^0.1","metadata.path","metadata.tags^0.1","metadata.channel^0.1","metadata.key"],"type":"best_fields","default_operator":"and","fuzziness":1,"phrase_slop":2}},"should":[{"range":{"date_original":{"boost":100,"gte":"now-1d/d"}}},{"range":{"date_original":{"boost":90,"gte":"now-7d/d"}}},{"range":{"date_original":{"boost":80,"gte":"now-14d/d"}}},{"range":{"date_original":{"boost":70,"gte":"now-1M/M"}}},{"range":{"date_original":{"boost":50,"gte":"now-3M/M"}}},{"range":{"date_original":{"boost":30,"gte":"now-6M/M"}}},{"range":{"date_original":{"boost":10,"gte":"now-12M/M"}}}],"filter":{"terms":{"docType.keyword":["Box","Confluence","IntegrationDirectory","DevPortal","MTSFaq","Postman","ProductCentral","StackOverflow","Tools","CircleHD","GitHub","Jira","CartRef"]}}}}]}}]}},"highlight":{"force_source":true,"pre_tags":["<mark>"],"post_tags":["</mark>"],"fields":{"description.analysed":{},"description":{},"title.analysed":{},"title":{}},"number_of_fragments":0},"size":0,"aggs":{"metadata.products.keyword":{"terms":{"field":"metadata.products.keyword","size":10,"order":{"_count":"desc"}}}}}
{"preference":"Region"}
{"query":{"bool":{"must":[{"bool":{"must":[{"bool":{"must":{"query_string":{"query":"test data","fields":["title^3000","description^50","title.analysed^100","description.analysed^20","date","metadata.createdBy","metadata.author","metadata.products","metadata.country","metadta.region","metadata.spaceKey^0.1","metadata.path","metadata.tags^0.1","metadata.channel^0.1","metadata.key"],"type":"best_fields","default_operator":"and","fuzziness":1,"phrase_slop":2}},"should":[{"range":{"date_original":{"boost":100,"gte":"now-1d/d"}}},{"range":{"date_original":{"boost":90,"gte":"now-7d/d"}}},{"range":{"date_original":{"boost":80,"gte":"now-14d/d"}}},{"range":{"date_original":{"boost":70,"gte":"now-1M/M"}}},{"range":{"date_original":{"boost":50,"gte":"now-3M/M"}}},{"range":{"date_original":{"boost":30,"gte":"now-6M/M"}}},{"range":{"date_original":{"boost":10,"gte":"now-12M/M"}}}],"filter":{"terms":{"docType.keyword":["Box","Confluence","IntegrationDirectory","DevPortal","MTSFaq","Postman","ProductCentral","StackOverflow","Tools","CircleHD","GitHub","Jira","CartRef"]}}}}]}}]}},"highlight":{"force_source":true,"pre_tags":["<mark>"],"post_tags":["</mark>"],"fields":{"description.analysed":{},"description":{},"title.analysed":{},"title":{}},"number_of_fragments":0},"size":0,"aggs":{"metadata.region.keyword":{"terms":{"field":"metadata.region.keyword","size":5,"order":{"_count":"desc"}}}}}

{"preference":"result"}
{"query":{"bool":{"must":[{"bool":{"must":[{"bool":{"must":{"query_string":{"query":"test data","fields":["title^3000","description^50","title.analysed^100","description.analysed^20","date","metadata.createdBy","metadata.author","metadata.products","metadata.country","metadta.region","metadata.spaceKey^0.1","metadata.path","metadata.tags^0.1","metadata.channel^0.1","metadata.key"],"type":"best_fields","default_operator":"and","fuzziness":1,"phrase_slop":2}},"should":[{"range":{"date_original":{"boost":100,"gte":"now-1d/d"}}},{"range":{"date_original":{"boost":90,"gte":"now-7d/d"}}},{"range":{"date_original":{"boost":80,"gte":"now-14d/d"}}},{"range":{"date_original":{"boost":70,"gte":"now-1M/M"}}},{"range":{"date_original":{"boost":50,"gte":"now-3M/M"}}},{"range":{"date_original":{"boost":30,"gte":"now-6M/M"}}},{"range":{"date_original":{"boost":10,"gte":"now-12M/M"}}}],"filter":{"terms":{"docType.keyword":["Box","Confluence","IntegrationDirectory","DevPortal","MTSFaq","Postman","ProductCentral","StackOverflow","Tools","CircleHD","GitHub","Jira","CartRef"]}}}}]}}]}},"highlight":{"force_source":true,"pre_tags":["<mark>"],"post_tags":["</mark>"],"fields":{"description.analysed":{},"description":{},"title.analysed":{},"title":{}},"number_of_fragments":0},"size":10,"from":0,"sort":[{"_score":{"order":"desc"}}]}

I have a set of UI-side filters such as Products, DocType, etc.

I am using some analyzers and ngram filters for some fields:

```
"filter": {
    "nGram_filter": {
        "type": "edge_ngram",
        "min_gram": 3,
        "max_gram": 6,
        "token_chars": [
            "letter",
            "digit",
            "punctuation",
            "symbol"
        ]
    }
},
"analyzer": {
    "custom_analyser": {
        "tokenizer": "standard",
        "filter": ["standard", "lowercase", "stop", "porter_stem"]
    },
    "nGram_analyzer": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": [
            "lowercase",
            "nGram_filter",
            "stop",
            "porter_stem"
        ]
    },
    "whitespace_analyzer": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": [
            "standard",
            "lowercase",
            "stop",
            "porter_stem"
        ]
    }
},
```

Mapping

```
"properties": {
    "date": {
        "type": "date"
    },
    "title": {
        "type": "text",
        "fields": {
            "keyword": {
                "type": "keyword",
                "ignore_above": 256
            },
            "analysed": {
                "type": "text",
                "analyzer": "nGram_analyzer",
                "search_analyzer": "whitespace_analyzer"
            }
        }
    },
    "description": {
        "type": "text",
        "store": true,
        "fields": {
            "keyword": {
                "type": "keyword",
                "ignore_above": 256
            },
            "analysed": {
                "type": "text",
                "analyzer": "custom_analyser",
                "search_analyzer": "custom_analyser"
            }
        }
    },
    "docType": {
        "type": "text",
        "fields": {
            "keyword": {
                "type": "keyword"
            }
        }
    },
    "id": {
        "type": "text"
    },
    "metadata": {
        "properties": {
            "author": {
                "type": "text",
                "fields": {
                    "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                    }
                }
            },
            ......
}
```

Could you run a couple of experiments: 1) remove all ngram field searches from your query, and 2) remove the aggregation portion of your query, and see whether either improves the performance of the first query? Could you also quantify "a lot of time to execute"?
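One way to set up both experiments is to derive the variants from the original request body instead of editing the JSON by hand. The following is only a minimal sketch: the helper names `strip_aggs` and `drop_fields` are made up for illustration, the sample body is a shortened version of the query posted above, and treating `title.analysed` as the ngram field is based on the mapping shown in this thread.

```python
import copy


def strip_aggs(body):
    """Return a copy of a search body without its aggregations."""
    out = copy.deepcopy(body)
    out.pop("aggs", None)
    out.pop("aggregations", None)
    return out


def drop_fields(node, unwanted):
    """Return a copy of `node` where every query_string 'fields' list
    no longer contains the field names in `unwanted` (boosts ignored)."""
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            if key == "query_string" and isinstance(value, dict) and "fields" in value:
                value = dict(value)  # shallow copy so the input is untouched
                value["fields"] = [
                    f for f in value["fields"] if f.split("^")[0] not in unwanted
                ]
                result[key] = value
            else:
                result[key] = drop_fields(value, unwanted)
        return result
    if isinstance(node, list):
        return [drop_fields(item, unwanted) for item in node]
    return node


# Shortened stand-in for the request body from this thread.
body = {
    "query": {"bool": {"must": {"query_string": {
        "query": "test data",
        "fields": ["title^3000", "title.analysed^100", "description^50"],
    }}}},
    "size": 0,
    "aggs": {"docType.keyword": {"terms": {"field": "docType.keyword", "size": 100}}},
}

no_aggs = strip_aggs(body)                       # experiment 2
no_ngram = drop_fields(body, {"title.analysed"})  # experiment 1
```

Each variant can then be sent with a search text that has not been used before, comparing the `took` value in the responses against the unmodified query.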

If you have 1 node, why did you set 2 replicas? How many records do you expect to have in this index?
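The reason for asking: Elasticsearch never places two copies of the same shard on one node, so replicas on a single-node cluster simply stay unassigned. A small sketch of that arithmetic follows; the function name `shard_copies` is made up for illustration, and it assumes no other allocation constraints (disk watermarks, allocation filters) are in play.

```python
def shard_copies(primaries, replicas, data_nodes):
    """Return (assigned, unassigned) shard copies for one index.

    Each shard has `replicas + 1` copies, and every copy of a given
    shard must live on a different data node.
    """
    copies_per_shard = replicas + 1
    assignable_per_shard = min(copies_per_shard, data_nodes)
    assigned = primaries * assignable_per_shard
    unassigned = primaries * (copies_per_shard - assignable_per_shard)
    return assigned, unassigned


# 5 primaries with 2 replicas each on a single node.
print(shard_copies(5, 2, 1))  # (5, 10)

# 1 primary with 0 replicas on a single node.
print(shard_copies(1, 0, 1))  # (1, 0)
```

With 5 shards and 2 replicas on one node, 10 of the 15 shard copies cannot be allocated, so the replicas add nothing for search speed there.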

I have about 20 lakh records in the index. I have only a single index.

I was just experimenting by increasing the shards and replicas. Would 0 replicas and 1 shard be enough?

Sure, I will try these options. When I say a lot of time: sometimes the query takes about 5 seconds, and if the search text is long and has many words, it takes 10+ seconds.

But if I execute the same query again, it executes in milliseconds.

I tried removing the aggregation queries and the ngram fields. They help a little in speeding up the search, but the first-time query still takes a long time. Note: for one of the fields, description, I have set store: true. This field contains a large amount of text.

When I checked the network tab, I saw that the queries return close to 800 KB - 1.2 MB of data for some requests. Does that also play a part in slowing down the query? The same query executed again returns in a few milliseconds, though.

What I think is happening here is that your queries produce a very large number of tokens, and looking up these tokens requires a lot of disk seeks. If you have spinning disks and not much memory for the file system cache, that can take quite a bit of time. Once these disk segments are loaded into the file system cache, it becomes fast.

How is the Elasticsearch node set up? What kind of disks do you have, how much memory, and what is the heap size for Elasticsearch?

When I run the cluster stats API, this is the response I get:

```
{
    "_nodes": {
        "total": 1,
        "successful": 1,
        "failed": 0
    },
    "cluster_name": "elasticsearch",
    "timestamp": 1545921427718,
    "status": "green",
    "indices": {
        "count": 1,
        "shards": {
            "total": 2,
            "primaries": 2,
            "replication": 0,
            "index": {
                "shards": {
                    "min": 2,
                    "max": 2,
                    "avg": 2
                },
                "primaries": {
                    "min": 2,
                    "max": 2,
                    "avg": 2
                },
                "replication": {
                    "min": 0,
                    "max": 0,
                    "avg": 0
                }
            }
        },
        "docs": {
            "count": 2072621,
            "deleted": 453
        },
        "store": {
            "size": "6gb",
            "size_in_bytes": 6510500026
        },
        "fielddata": {
            "memory_size": "0b",
            "memory_size_in_bytes": 0,
            "evictions": 0
        },
        "query_cache": {
            "memory_size": "7.5mb",
            "memory_size_in_bytes": 7882536,
            "total_count": 72746,
            "hit_count": 23652,
            "miss_count": 49094,
            "cache_size": 4680,
            "cache_count": 5348,
            "evictions": 668
        },
        "completion": {
            "size": "0b",
            "size_in_bytes": 0
        },
        "segments": {
            "count": 55,
            "memory": "11mb",
            "memory_in_bytes": 11610299,
            "terms_memory": "9.7mb",
            "terms_memory_in_bytes": 10255840,
            "stored_fields_memory": "1.1mb",
            "stored_fields_memory_in_bytes": 1252448,
            "term_vectors_memory": "0b",
            "term_vectors_memory_in_bytes": 0,
            "norms_memory": "49.3kb",
            "norms_memory_in_bytes": 50496,
            "points_memory": "32.7kb",
            "points_memory_in_bytes": 33535,
            "doc_values_memory": "17.5kb",
            "doc_values_memory_in_bytes": 17980,
            "index_writer_memory": "0b",
            "index_writer_memory_in_bytes": 0,
            "version_map_memory": "0b",
            "version_map_memory_in_bytes": 0,
            "fixed_bit_set": "0b",
            "fixed_bit_set_memory_in_bytes": 0,
            "max_unsafe_auto_id_timestamp": -1,
            "file_sizes": {}
        }
    },
    "nodes": {
        "count": {
            "total": 1,
            "data": 1,
            "coordinating_only": 0,
            "master": 1,
            "ingest": 1
        },
        "versions": [
            "6.2.4"
        ],
        "os": {
            "available_processors": 4,
            "allocated_processors": 4,
            "names": [
                {
                    "name": "Linux",
                    "count": 1
                }
            ],
            "mem": {
                "total": "25.5gb",
                "total_in_bytes": 27389636608,
                "free": "307.5mb",
                "free_in_bytes": 322494464,
                "used": "25.2gb",
                "used_in_bytes": 27067142144,
                "free_percent": 1,
                "used_percent": 99
            }
        },
        "process": {
            "cpu": {
                "percent": 0
            },
            "open_file_descriptors": {
                "min": 224,
                "max": 224,
                "avg": 224
            }
        },
        "jvm": {
            "max_uptime": "167.4d",
            "max_uptime_in_millis": 14464554319,
            "versions": [
                {
                    "version": "1.8.0_151",
                    "vm_name": "OpenJDK 64-Bit Server VM",
                    "vm_version": "25.151-b12",
                    "vm_vendor": "Oracle Corporation",
                    "count": 1
                }
            ],
            "mem": {
                "heap_used": "2gb",
                "heap_used_in_bytes": 2167123472,
                "heap_max": "7.9gb",
                "heap_max_in_bytes": 8555069440
            },
            "threads": 66
        },
        "fs": {
            "total": "145.3gb",
            "total_in_bytes": 156067389440,
            "free": "126.2gb",
            "free_in_bytes": 135509594112,
            "available": "126.1gb",
            "available_in_bytes": 135492816896
        },
        "plugins": [],
        "network_types": {
            "transport_types": {
                "netty4": 1
            },
            "http_types": {
                "netty4": 1
            }
        }
    }
}
```
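For anyone reading these stats with the file system cache question in mind, a few of the numbers can be boiled down as follows. This is a rough sketch only: the `summarize` helper is made up for illustration, the `stats` dict is a hand-copied subset of the response above, and "memory outside the heap" is an upper bound on what the OS could use for caching index files, since it is shared with every other process on the machine.

```python
GIB = 1024 ** 3

# Subset of the cluster stats response, copied by hand.
stats = {
    "indices": {
        "store": {"size_in_bytes": 6510500026},
        "query_cache": {"total_count": 72746, "hit_count": 23652},
    },
    "nodes": {
        "os": {"mem": {"total_in_bytes": 27389636608}},
        "jvm": {"mem": {"heap_max_in_bytes": 8555069440}},
    },
}


def summarize(stats):
    """Derive a few sizing figures from a cluster stats response."""
    store = stats["indices"]["store"]["size_in_bytes"]
    cache = stats["indices"]["query_cache"]
    total_mem = stats["nodes"]["os"]["mem"]["total_in_bytes"]
    heap_max = stats["nodes"]["jvm"]["mem"]["heap_max_in_bytes"]
    return {
        # Size of the index files on disk.
        "index_gib": round(store / GIB, 2),
        # RAM not reserved for the JVM heap (shared with all other processes).
        "non_heap_gib": round((total_mem - heap_max) / GIB, 2),
        # Fraction of query cache lookups that were hits.
        "query_cache_hit_ratio": round(cache["hit_count"] / cache["total_count"], 3),
    }


summary = summarize(stats)
print(summary)
```

The index is roughly 6 GiB and about 17.5 GiB of RAM sits outside the heap, so the whole index could fit in the file system cache if nothing else on the machine competes for that memory. The query cache hit ratio works out to roughly a third.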

And what kind of disks are these?

PersistentDisk - ext4 type

ATA Device with non-removable media.

This ES server runs on a GCP machine.

Do you run anything else on this machine? If not then I am not really sure what can be done. Basically, after you run a few queries like this, all necessary pieces of your index will end up in the file system cache and unless you run some other processes on this machine they will stay there. So performance should significantly improve as the node "warms up".

Yes, I also have a MongoDB server running on the same machine.
Is there a way I can pre-cache the contents?

We don't have anything built-in in 6.2.4, and if you are running MongoDB it will probably evict elasticsearch-related files over time anyway unless you are searching in elasticsearch all that time.
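As a workaround outside Elasticsearch, some people simply read the index files once so that the OS pulls them into the file system cache. The sketch below is not an Elasticsearch feature and was not tested on this cluster; the function name `warm_page_cache` is made up, and the data path in the usage comment is an assumption that depends on how the node was installed. The caveat above still applies: other processes such as MongoDB can push these pages out again.

```python
import os


def warm_page_cache(index_dir, chunk_size=1 << 20):
    """Read every file under index_dir once and return the bytes read.

    Reading a file makes the OS load it into the file system cache,
    provided enough free memory is available to hold it.
    """
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "rb") as fh:
                    while True:
                        chunk = fh.read(chunk_size)
                        if not chunk:
                            break
                        total += len(chunk)
            except OSError:
                # Segment files can disappear during a merge; skip them.
                continue
    return total


# Hypothetical usage; the path below is an assumption, check path.data:
# warm_page_cache("/var/lib/elasticsearch/nodes/0/indices")
```

Replaying a handful of representative searches after a restart has a similar effect and only touches the parts of the index that queries actually need.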

OK, thanks for the info :)
