I am seeing degraded query latency in v8. We are upgrading our cluster from 7.16.2 to 8.8.2 by standing up a duplicate cluster on the new version and reindexing the data into it. Latency is 500 ms to several seconds in v8 vs. 20-200 ms in v7. To compare query performance, a subset of the queries that the v7 cluster receives is also executed on the v8 cluster.
Why is the query latency so much higher in the v8 cluster? Most of the slow queries are not reproducible, i.e. the high latency does not persist on a subsequent run of the same query unless we clear the cache first.
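For reference, the comparison between the two clusters can be summarized along these lines. This is a minimal sketch, not the actual tooling: it assumes the `took` values (in milliseconds) from the search responses of the replayed queries have been collected into one list per cluster, and the function names `latency_summary` and `compare_clusters` are made up for illustration.

```python
from statistics import quantiles


def latency_summary(took_ms):
    """Return p50/p95/p99 for a list of `took` values in milliseconds."""
    if len(took_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = quantiles(took_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def compare_clusters(v7_took_ms, v8_took_ms):
    """Put the v7 and v8 percentiles side by side, with the v8/v7 ratio."""
    v7 = latency_summary(v7_took_ms)
    v8 = latency_summary(v8_took_ms)
    return {
        p: {
            "v7": v7[p],
            "v8": v8[p],
            # Ratio is undefined when the v7 percentile is zero.
            "ratio": (v8[p] / v7[p]) if v7[p] else None,
        }
        for p in v7
    }
```

Looking at p95/p99 rather than the mean matters here, because the slow queries are intermittent and mostly show up in the tail.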
Some useful details about the clusters:
- 100+ indices, with sizes ranging from a few MB up to tens of TB, shard counts ranging from 5 to 2048, and 1 replica.
- Each index in the v8 cluster has at least as many shards as the corresponding index in the v7 cluster, so that every shard in the v8 cluster is ~20 GB.
- Our ES use case is storage-bound: CPU and memory load on the ES nodes is low, and we don't see any load difference between the two clusters.
- All the custom cluster settings are the same in both clusters, except that xpack.security.enabled is explicitly set to true in the v8 cluster (it is not explicitly set in the v7 cluster, but its default value is true).
- The following default settings are overridden in the v8 cluster to make them the same as in the v7 cluster:
  - action.destructive_requires_name: false
  - cluster.routing.allocation.enforce_default_tier_preference: false
  - cluster.routing.allocation.type: balanced
  - http.max_header_size: 8kb
  - indices.query.bool.max_clause_count: 1024
  - indices.query.bool.max_nested_depth: 20
  - search.max_async_search_response_size: -1b
  - thread_pool.get.size: 8
  - thread_pool.snapshot.max: 4
  - transport.compress: false
  - transport.compression_scheme: DEFLATE
- Both clusters have cluster.routing.allocation.awareness.attributes: zone, and es.search.ignore_awareness_attributes=false is set in the v8 cluster to override the default value of true.
- Some indices in the v7 cluster use the default value for index.codec and some use best_compression, whereas all the indices in the v8 cluster use the best_compression codec.
- We’ve added sorting on the creation time field to the v8 cluster indices. Some of the v7 cluster indices did not have this sorting.
- Not sure if this is a clue or just noise, but we're seeing significantly higher network traffic in the v8 cluster (bytes sent/received are 3-5x higher). There is also more cross-AZ traffic in the v8 cluster than in the v7 cluster.
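To put numbers on the network difference, one option is to take a snapshot of `GET _nodes/stats/transport` on each cluster before and after replaying the same batch of queries and compare the per-node byte counters. The sketch below assumes the two responses have already been fetched and parsed into dicts; the function names are hypothetical, and it reads only the `name`, `transport.rx_size_in_bytes` and `transport.tx_size_in_bytes` fields of the nodes stats response.

```python
def transport_bytes(nodes_stats):
    """Map node name -> (rx_bytes, tx_bytes) from a parsed nodes stats response."""
    result = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        transport = node.get("transport", {})
        # Fall back to the node id if the response carries no node name.
        result[node.get("name", node_id)] = (
            transport.get("rx_size_in_bytes", 0),
            transport.get("tx_size_in_bytes", 0),
        )
    return result


def transport_delta(before, after):
    """Per-node (rx, tx) bytes transferred between two snapshots.

    Nodes missing from either snapshot are skipped.
    """
    b = transport_bytes(before)
    a = transport_bytes(after)
    return {
        name: (a[name][0] - b[name][0], a[name][1] - b[name][1])
        for name in a
        if name in b
    }
```

Running this against both clusters for the same replayed queries would show whether the extra traffic is node-to-node transport traffic and which nodes carry most of it. It does not by itself separate cross-AZ from same-AZ traffic; that would need the zone attribute of each node.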
I can provide any additional details that would help with the investigation.