Thread pool / queue size limitation unsolved

I am using ES to do some data indexing on Windows. However, I keep running into the errors below. It seems to be a queue size or thread pool size problem, but I could not find any documentation that explains which settings I have to change on Windows to solve it.

[2016-07-20 11:11:56,343][DEBUG][action.search            ] [Adaptoid] [cpu-2015.09.23][2], node[1Qp4zwR_Q5GLX-VChDOc2Q], [P], v[42], s[STARTED], a[id=KznRm9A5S0OhTMZMoED0qA]: Failed to execute [org.elasticsearch.action.search.SearchRequest@444b07] lastShard [true]
RemoteTransportException[[Adaptoid][172.16.1.238:9300][indices:data/read/search[phase/query]]]; nested: EsRejectedExecutionException[rejected execution of org.elasticsearch.transport.TransportService$4@cd47e on EsThreadPoolExecutor[search, queue capacity = 1000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@9c72f5[Running, pool size = 4, active threads = 4, queued tasks = 1000, completed tasks = 1226]]];
Caused by: EsRejectedExecutionException[rejected execution of org.elasticsearch.transport.TransportService$4@cd47e on EsThreadPoolExecutor[search, queue capacity = 1000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@9c72f5[Running, pool size = 4, active threads = 4, queued tasks = 1000, completed tasks = 1226]]]
	at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:50)

Does anyone have experience with this?

Hi @Kennedy_Kan1,

this exception means that you are overwhelming a part of the system with tasks, and as a countermeasure it rejects your requests.

We have a couple of thread pools in Elasticsearch for different tasks. You can see from the message that it is the search thread pool you're overwhelming. It is configured with 4 threads that do the actual work and a queue in front with a capacity of 1000 items, which allows it to accept requests even while all worker threads are busy. Once the queue is full, you get these exceptions telling you that you're feeding the system too much work and should back off for a short time.
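If it helps to see the mechanism in isolation: EsThreadPoolExecutor is built on the JDK's ThreadPoolExecutor, so a plain-JDK sketch behaves the same way. A pool with 4 workers and a bounded queue of 1000 rejects the 1005th task, just like the exception above (RejectionDemo is only an illustration, not Elasticsearch code):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RejectionDemo {
    public static void main(String[] args) {
        // 4 worker threads and a bounded queue of 1000 tasks, like the search pool in the log above
        ThreadPoolExecutor pool = new ThreadPoolExecutor(4, 4, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>(1000), new ThreadPoolExecutor.AbortPolicy());
        try {
            // 4 tasks start running, 1000 wait in the queue, task number 1005 is rejected
            for (int i = 0; i < 1005; i++) {
                pool.execute(() -> {
                    try {
                        Thread.sleep(10_000); // keep the workers busy so the queue fills up
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        } catch (RejectedExecutionException e) {
            // the plain-JDK counterpart of EsRejectedExecutionException
            System.out.println("rejected: " + e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```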

There are a few things you can do:

Elasticsearch has chosen 4 threads for this pool based on your processor count. You can increase the number of threads in this pool (see the thread pool docs). However, more threads do not magically give you more resources; you just end up with more threads competing for the same hardware.
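Whatever you tune, it helps to watch the pool instead of waiting for the next rejection in the logs. Here is a rough sketch of how you could read the search pool statistics (size, queue length, rejection count) via the Java API; I'm assuming the 2.x transport client, and SearchPoolStats is just a made-up name:

```java
import org.elasticsearch.action.admin.cluster.node.stats.NodeStats;
import org.elasticsearch.action.admin.cluster.node.stats.NodesStatsResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.threadpool.ThreadPoolStats;

public class SearchPoolStats {
    // prints pool size, queue length and rejection count of the search pool on every node
    public static void print(Client client) {
        NodesStatsResponse response = client.admin().cluster().prepareNodesStats()
                .clear()              // only fetch the sections requested below
                .setThreadPool(true)  // include thread pool statistics
                .get();
        for (NodeStats node : response.getNodes()) {
            for (ThreadPoolStats.Stats stats : node.getThreadPool()) {
                if ("search".equals(stats.getName())) {
                    System.out.printf("node=%s threads=%d active=%d queue=%d rejected=%d%n",
                            node.getNode().getName(), stats.getThreads(), stats.getActive(),
                            stats.getQueue(), stats.getRejected());
                }
            }
        }
    }
}
```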

You can also use a beefier machine (if I got the maths right, you're on a dual-core machine) or add another node. Then you have more capacity.
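For reference on the maths: if I remember correctly, the 2.x default size for the search pool is ((number of processors * 3) / 2) + 1, which is how I arrived at the dual-core guess:

```java
public class SearchPoolDefault {
    public static void main(String[] args) {
        // assumed ES 2.x default for the search pool size: ((processors * 3) / 2) + 1
        int processors = 2;                           // a dual-core machine
        int searchThreads = (processors * 3) / 2 + 1; // (2 * 3) / 2 + 1 = 4, matching the log above
        System.out.println(searchThreads);            // prints 4
    }
}
```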

Next, you can reduce the load by throttling requests on the client or implementing a backoff mechanism there. We have implemented such a mechanism for bulk requests in the Java API (see BackoffPolicy.java on GitHub); a sketch of how to wire it up follows below.
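If you index via the Java API and are on a recent 2.x release, wiring the backoff policy into a BulkProcessor looks roughly like this (an untested sketch; BulkWithBackoff is just a name for the example, and the retry parameters are arbitrary):

```java
import org.elasticsearch.action.bulk.BackoffPolicy;
import org.elasticsearch.action.bulk.BulkProcessor;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.unit.TimeValue;

public class BulkWithBackoff {
    public static BulkProcessor create(Client client) {
        return BulkProcessor.builder(client, new BulkProcessor.Listener() {
            @Override
            public void beforeBulk(long executionId, BulkRequest request) {
                // nothing to do before a bulk is sent
            }

            @Override
            public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
                if (response.hasFailures()) {
                    System.err.println(response.buildFailureMessage());
                }
            }

            @Override
            public void afterBulk(long executionId, BulkRequest request, Throwable failure) {
                failure.printStackTrace();
            }
        })
        .setBulkActions(1000)        // flush after 1000 index requests
        .setConcurrentRequests(1)    // at most one bulk in flight at a time
        // retry rejected bulks, starting at 100ms and doubling the wait, at most 3 times
        .setBackoffPolicy(BackoffPolicy.exponentialBackoff(TimeValue.timeValueMillis(100), 3))
        .build();
    }
}
```

You then hand your index requests to the processor with bulkProcessor.add(...), and it retries bulks that were rejected with EsRejectedExecutionException according to the policy.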

But there is one thing I'd advise against, and that is increasing the queue size. If you think about it, there are already 1000 tasks waiting to be processed; increasing the queue size only defers the problem, because the system still does not have any more capacity to handle the load.

I hope that helps you better understand and resolve the issue.

Daniel
