I have around 14 indices, each with 5 primary shards and 1 replica, on this node.
Each index is 40-50 GB. Each index stores data for a particular day, and we have search queries (mainly terms aggregations) that can span from the current date back to 2 weeks ago.
When querying 2 weeks' worth of data, we get this exception:
{
  "type": "es_rejected_execution_exception",
  "reason": "rejected execution of org.elasticsearch.transport.TransportService$7@6f7799fa on EsThreadPoolExecutor[search, queue capacity = 1000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@6c0d64d2[Running, pool size = 25, active threads = 25, queued tasks = 3410, completed tasks = 1538807]]"
}
Que.1: Why are we getting so many tasks?
Que.2: Will increasing the queue size help?
Que.3: What configuration changes may help if we can't reduce the number of requests to ES?
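On Que.1: each search request fans out into one shard-level task per shard it touches, so the task count multiplies quickly. A rough back-of-the-envelope sketch using the figures from this thread (the overflow threshold is an illustration, not Elasticsearch internals):

```python
# Rough arithmetic for the shard fan-out, assuming one shard-level
# search task per shard copy searched (figures from the post above).
indices_per_query = 14      # one index per day, querying two weeks back
primary_shards = 5          # per index; replicas add copies, not extra
                            # tasks for a single query
tasks_per_query = indices_per_query * primary_shards
print(tasks_per_query)      # 70 shard-level tasks per search

# With a search pool of 25 threads and a queue capacity of 1000,
# the queue overflows once enough queries fan out at the same time:
pool_size = 25
queue_capacity = 1000
queries_to_overflow = (queue_capacity + pool_size) // tasks_per_query + 1
print(queries_to_overflow)  # 15 concurrent searches can overflow it
```

With 10 parallel client buckets times several clients, 3410 queued tasks is consistent with this fan-out.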
There are multiple queries sent simultaneously by multiple clients. When querying less than 1 week of data, queries take about 60-80 seconds. For 2 weeks of data we hit this exception, so we don't know the response time.
If you have slow storage and queries are piling up, queues tend to fill up. I would not be surprised if you saw a dramatic improvement by switching to SSDs, as they handle random disk I/O a lot better than HDDs.
I will try to explain the scenario: we are sending at most 10 parallel search queries (by dividing the 2-week window into 10 buckets). Pasting the sample query we are using:
Que.1 How does Elasticsearch divide these requests into tasks internally? I didn't find any resources regarding this.
Que.2 How can we optimize the query to get the same data in less time? We're thinking of using scroll instead of aggregations; which is better?
I'm fairly new to ES so pardon me if I have asked any stupid questions.
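On Que.2: if the goal of a large terms aggregation is to enumerate all buckets, a composite aggregation paged with `after` is usually a better fit than either a huge `size` or scroll (scroll returns raw documents, not aggregated buckets). A minimal sketch of the request bodies, assuming a hypothetical `user_id` field and page size:

```python
# Sketch: paging a composite aggregation instead of one huge terms
# aggregation. The field name "user_id" and page size are assumptions.
def composite_page(after_key=None, page_size=1000):
    """Build one page of a composite-aggregation request body."""
    composite = {
        "size": page_size,
        "sources": [{"user": {"terms": {"field": "user_id"}}}],
    }
    if after_key is not None:
        composite["after"] = after_key  # resume from the previous page's after_key
    return {
        "size": 0,  # buckets only, no hits
        "aggs": {"by_user": {"composite": composite}},
    }

first_page = composite_page()
# Each response carries an "after_key"; feed it into the next request:
next_page = composite_page(after_key={"user": "u_12345"})
```

Each page returns a bounded number of buckets, so memory use stays flat no matter how many distinct terms exist.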
Are you seeing any evidence in the logs of long or frequent GC? You are specifying very large size parameters for your terms aggregations, which can lead to a lot of unnecessary memory usage. I would recommend tuning this query to make it as efficient as possible, since you run it frequently.
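Concretely, bounding `size` (and `shard_size`) keeps per-shard memory small. A sketch of a tuned terms aggregation; the field name `status` and the specific numbers are illustrative, not recommendations for this cluster:

```python
# Sketch: a bounded terms aggregation instead of a very large "size".
# "status" is a hypothetical field; tune size/shard_size to your data.
query = {
    "size": 0,  # skip returning hits, buckets only
    "aggs": {
        "by_status": {
            "terms": {
                "field": "status",
                "size": 100,        # only the top 100 buckets are returned
                "shard_size": 500,  # extra per-shard candidates for accuracy
            }
        }
    },
}
```

A larger `shard_size` trades a little per-shard work for better accuracy of the top-N buckets across shards, while still avoiding the memory cost of materializing every term.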