We are seeing some very strange query performance on most of our indices when replicas are enabled. Since we require high availability, and all our data nodes run directly on local SSD drives, we use 2 replicas for all indices.
When firing very complex queries (5K+ lines of pretty-printed JSON) against a medium-sized index, we see 20-30 second response times and timeouts. With all replicas disabled, the response time drops to a consistent 3.5 seconds.
Our setup:
Elasticsearch 6.2.4
8 data nodes on local SSD, 18GB heap, 10 CPUs each, spread across 4 physical hypervisors
Index mapping with 300 fields, 20 of them nested.
5M docs in index. ~100GB in primary shards.
16 shards
Query has many dis_max parts, aggregations, sorting, and highlighting. No filters.
Why do replicas have such a big impact on query performance? Whether we have 16×1 shards or 16×3 shards should not change the fact that a query hits 16 unique shards across all nodes (avg 2 per node).
The cluster is pretty much idle before and during the query. No noticeable load, CPU or disk usage on any node. All nodes have the same hardware and performance.
What I've tried already:
Reducing the number of shards to 8 or 4: this actually increases the response time with replicas disabled. With replicas enabled, I see the same extreme query times (20-30+ seconds, plus timeouts).
Tweaking the query, removing aggregations, sorting, and other parts: this did affect overall response times, but the large gap between 0 and 2 replicas remains.
Profiling the query for different shard/replica setups: no single part stood out as the cause of the difference, but there is too much data to go over manually. Are there any good tools for analyzing the Profile response?
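For reference, the profiled responses I'm comparing come from adding `"profile": true` to the search body, roughly like this (index and field names simplified from our real mapping):

```json
GET /my-index/_search
{
  "profile": true,
  "query": {
    "dis_max": {
      "queries": [
        { "match": { "title": "example" } },
        { "match": { "body": "example" } }
      ]
    }
  }
}
```

The response then includes a per-shard breakdown of query and collector timings, which is exactly the part that is too verbose to compare by hand across shard/replica setups.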
When a shard is held on only one node (i.e. one primary, no replicas), any caching (e.g. at the filesystem or segment level) will make repeat queries faster.
When you have replicas, your queries are randomized across copies, some of which will have empty caches for your query while others are warmed.
You can use a client session ID as the custom value in the `preference` setting to route queries back to the same replica where possible, and so gain the benefits of a warm cache.
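As a sketch, the custom `preference` value is passed as a query-string parameter on the search request (the session ID here is an arbitrary placeholder — any stable per-user string works):

```json
GET /my-index/_search?preference=session-abc123
{
  "query": {
    "match": { "title": "example" }
  }
}
```

As long as the same `preference` string is sent for a given user, their requests will be routed to the same shard copies, so repeated queries within a session hit warm caches instead of a cold replica.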
So in general, as the number of replicas goes up, you should also increase memory to account for the fact that the same shard data is cached on multiple nodes?
Are there any other ways to increase query performance for these very big queries? Even when providing enough memory to read the entire shards into memory, I still see query times of several seconds.
Replicas will help with serving concurrent users' search load, but of course only if you give them the compute resources they demand (spindles, cores, and yes, memory).
Session-based routing will ensure you make the best use of that memory, though, because it returns a user to the same choice of replica and increases the chance of cache hits (user sessions typically repeat queries when hitting "next page", etc.).