I am using synchronous get with high level rest client. My application is having 40 threads, but after running for some time, I am able to see request rejected exception because queue size limit is 1000 for get request.
My question is if I am using 40 threads for synchronous get, why are 1000 get requests in pipeline. There are no other application using this index.
Is it something to do with rest client async call back?
When it receives a search query Elasticsearch does a lot of things under the hood. Typically it will start one worker thread per shard it needs to query, to search them in parallel. So if there are N shards it needs to query on a node, it will try to start N search threads. When there are no vacant worker threads left (because all are busy) new search requests are pushed on the queue.
This means that if a node has many shards or if the search traffic is heavy you may risk filling up this queue (max 1000) and then you'll start to get rejections.
The best way to reduce this problem is to limit the number of shards hit by the search request, you can do that in three ways:
Restrict the query to match only relevant indices, i.e. instead of GET _all/_search use GET my_index/_search if possible.
Minimize the number of shards per index by aiming for shard sizes of 20-40GB.
Limit the total number of shards per node, add more nodes to spread the shards when too many.
So what is too many shards? Well, according to an official blog post you should have less than 20 shards per GB of heap space, so if you have 30 GB of Java Heap Space on each data node you should try to stay below 20 * 30 = 600 shards (primary + replica) on each node.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.