Increased latency with query cache

Cluster stats:
Version: OpenDistro 6.8.17
Elasticsearch: 6 data nodes, 202 primary shards, 412 total shards
CPU model: Intel(R) Xeon(R) Gold 6248 CPU @ 2.50GHz
CPU total logical cores: 8
Memory: 32 GB (16 GB heap)

Hello, we have a latency issue in our cluster. We currently have a big index that is around 150 GB and uses a "nested" mapping. Right now the client can't split this index into smaller parts or change it on their side, so we added 3 new nodes to mitigate the latency for the time being.

After we added the nodes (11/03 00:00), latency dropped to around 100 ms for the day. But starting around 11/04 23:00, latency began increasing again.

We compared the graphs against a day with low latency and noticed that the "Query cache evictions" graph keeps getting bigger every day.

"Query cache evictions" graph when latency was low (after 3 new nodes were added):
low_latency

"Query cache evictions" graph when latency srated to grow (day after 3 new nodes were added):
high_latency

Current "Query cache evictions" graph:
current

Query cache stats on nodes:

node5
      "indices" : {
        "query_cache" : {
          "memory_size_in_bytes" : 1170057215,
          "total_count" : 369062289,
          "hit_count" : 122639543,
          "miss_count" : 246422746,
          "cache_size" : 77436,
          "cache_count" : 193795,
          "evictions" : 116359
        }
      }
node4
      "indices" : {
        "query_cache" : {
          "memory_size_in_bytes" : 1020056809,
          "total_count" : 421283305,
          "hit_count" : 115667922,
          "miss_count" : 305615383,
          "cache_size" : 78128,
          "cache_count" : 221919,
          "evictions" : 143791
        }
      }
node1
      "indices" : {
        "query_cache" : {
          "memory_size_in_bytes" : 1203386167,
          "total_count" : 1848846448,
          "hit_count" : 617771905,
          "miss_count" : 1231074543,
          "cache_size" : 86745,
          "cache_count" : 1210253,
          "evictions" : 1123508
        }
      }
node6
     "indices" : {
        "query_cache" : {
          "memory_size_in_bytes" : 1181094265,
          "total_count" : 457289610,
          "hit_count" : 139265810,
          "miss_count" : 318023800,
          "cache_size" : 77452,
          "cache_count" : 202133,
          "evictions" : 124681
        }
      }
node3
      "indices" : {
        "query_cache" : {
          "memory_size_in_bytes" : 1187569041,
          "total_count" : 2337253591,
          "hit_count" : 655115448,
          "miss_count" : 1682138143,
          "cache_size" : 93744,
          "cache_count" : 1433040,
          "evictions" : 1339296
        }
      }
node2
      "indices" : {
        "query_cache" : {
          "memory_size_in_bytes" : 1079767960,
          "total_count" : 1958652134,
          "hit_count" : 449480825,
          "miss_count" : 1509171309,
          "cache_size" : 67218,
          "cache_count" : 924891,
          "evictions" : 857673
        }
      }
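
For reference, the per-node numbers above come from the node stats API; a call along these lines reproduces them (the host and port are placeholders for one of our nodes):

    # Node stats restricted to the indices.query_cache section, with human-readable sizes
    curl -s 'http://localhost:9200/_nodes/stats/indices/query_cache?human&pretty'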

My questions are:

  1. Does an increased number of "query cache evictions" also increase latency?
  2. Would it help to increase the "indices.queries.cache.size" parameter? Currently it's 10% of the 16 GB heap, so around 1.6 GB (see the sketch after this list). Is it safe to increase it to 20% or 30%?
  3. Is there any other way I can improve this?
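
For reference, question 2 would mean a change along these lines in elasticsearch.yml on each data node; the setting is static, so it only takes effect after a rolling restart, and the 20% below is just one of the values I'm asking about:

    # elasticsearch.yml (per data node); static setting, picked up only on restart.
    # Accepts a percentage of the heap or an absolute size such as 2gb; the default is 10%.
    indices.queries.cache.size: 20%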

Can anyone help?
Thanks!

A cache eviction will generate a delay, yes.
Ultimately, other than flattening some of your nesting structures, the best way to deal with this is more resources (nodes, heap, CPU, etc.).
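
To make the "flattening" suggestion concrete: a field mapped as "nested" indexes every array element as a hidden Lucene sub-document and has to be searched through nested queries, which makes it more expensive to query and cache. A minimal sketch with a made-up field name:

    "comments": {
      "type": "nested",
      "properties": { "author": { "type": "keyword" } }
    }

The flattened alternative is the default object mapping, which keeps the values inside the parent document (you lose per-element matching, but queries become cheaper):

    "comments": {
      "properties": { "author": { "type": "keyword" } }
    }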
