We recently migrated our Elasticsearch cluster from version 7.7.1 to 7.17.0 and have noticed that the query cache keeps growing, apparently without bound, whenever indexing runs on the nodes at the same time as a reasonably high search load. This leads to high latency on search queries. The heap, cache, and cluster settings are identical on the old and new versions, but this was never an issue on the old version.
We even set indices.queries.cache.size to 5%, which per the documentation should cap the query cache at 5% of the heap. This held for a while, but after a seemingly random number of indexing cycles the cache size shot up again.
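For reference, this is how the limit is configured on our side; indices.queries.cache.size is a static node-level setting, so it lives in elasticsearch.yml on each data node and only takes effect after a restart (minimal sketch of the relevant line) -

# elasticsearch.yml on every data node (static setting, applied via rolling restart)
indices.queries.cache.size: 5%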
Details about the Elastic cluster -
- We have five data nodes, and the cluster holds two indices whose shards are spread across all of them.
- One index has 5 primary shards with 2 replicas each; the second index has 10 primary shards with 2 replicas each.
- All data nodes have a heap size of 12 GB.
- The indexing job runs every 30 minutes on a delta data set.
Cache snapshot from the old cluster, where the query cache size stays roughly flat -
Cache snapshot from the new cluster, where the query cache size keeps growing -
Output of GET _stats/query_cache?human from the new cluster -
{
  "_shards": {
    "total": 45,
    "successful": 45,
    "failed": 0
  },
  "_all": {
    "primaries": {
      "query_cache": {
        "memory_size": "29.2gb",
        "memory_size_in_bytes": 31356977376,
        "total_count": 1462486084,
        "hit_count": 126974108,
        "miss_count": 1335511976,
        "cache_size": 0,
        "cache_count": 51339608,
        "evictions": 51339608
      }
    },
    "total": {
      "query_cache": {
        "memory_size": "95.6gb",
        "memory_size_in_bytes": 102698467866,
        "total_count": 4905909848,
        "hit_count": 413144281,
        "miss_count": 4492765567,
        "cache_size": 0,
        "cache_count": 175519223,
        "evictions": 175519223
      }
    }
  },
  "indices": {
    "index_name_1": {
      "uuid": "sRAA8PCrQuywlDaTAymh9A",
      "primaries": {
        "query_cache": {
          "memory_size": "19.7gb",
          "memory_size_in_bytes": 21167687989,
          "total_count": 0,
          "hit_count": 0,
          "miss_count": 0,
          "cache_size": 0,
          "cache_count": 0,
          "evictions": 0
        }
      },
      "total": {
        "query_cache": {
          "memory_size": "63.7gb",
          "memory_size_in_bytes": 68465645244,
          "total_count": 0,
          "hit_count": 0,
          "miss_count": 0,
          "cache_size": 0,
          "cache_count": 0,
          "evictions": 0
        }
      }
    },
    "index_name_2": {
      "uuid": "1OmlWuhCRFeZkC05QjYKNQ",
      "primaries": {
        "query_cache": {
          "memory_size": "9.4gb",
          "memory_size_in_bytes": 10189289387,
          "total_count": 1462486084,
          "hit_count": 126974108,
          "miss_count": 1335511976,
          "cache_size": 0,
          "cache_count": 51339608,
          "evictions": 51339608
        }
      },
      "total": {
        "query_cache": {
          "memory_size": "31.8gb",
          "memory_size_in_bytes": 34232822622,
          "total_count": 4905909848,
          "hit_count": 413144281,
          "miss_count": 4492765567,
          "cache_size": 0,
          "cache_count": 175519223,
          "evictions": 175519223
        }
      }
    }
  }
}
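Happy to share the per-node breakdown as well if that helps; it can be pulled with the node stats API, e.g. -

GET _nodes/stats/indices/query_cache?human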
We have been struggling with this for the past two weeks, so I am reaching out to see if there are any pointers on what we can check next, and whether there are other configurations that need to be set alongside indices.queries.cache.size.
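As a stopgap, would it be reasonable to periodically clear the query cache on these indices (sketch below, using the two index names from the stats output), or would that just mask the underlying issue?

POST index_name_1,index_name_2/_cache/clear?query=true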