Increasing Threadpool Queue Sizes

I was wondering if anybody had any experience with increasing bulk and index queue sizes. I am working on an application that encounters large bursts of index requests that have so far been managed by increasing the queue sizes, as we can handle slower response times.

We have not seen any major issues with resource saturation or garbage collection, but I wanted to know what kinds of problems we could potentially run into. Does anybody know what problems or side effects can be caused by increasing bulk/index queue sizes?


Thread pools and their queues are held in memory while they are in use. Increasing their size, and how heavily they are used, therefore puts more pressure on the heap, with all the impact that can have.
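To get a feel for that heap pressure, consider the worst case: a full queue where every slot holds a request body. This is a minimal sketch with hypothetical figures. The queue sizes and the 5 MiB average request size are illustrative assumptions, not measured values or Elasticsearch defaults, and `queued_bytes` is a made-up helper.

```python
def queued_bytes(queue_size: int, avg_request_bytes: int) -> int:
    """Worst-case bytes held by one full queue (ignores per-object overhead)."""
    return queue_size * avg_request_bytes


# Hypothetical figures, for illustration only.
AVG_BULK_REQUEST = 5 * 1024 * 1024  # assume 5 MiB per bulk request

small = queued_bytes(queue_size=200, avg_request_bytes=AVG_BULK_REQUEST)
large = queued_bytes(queue_size=2000, avg_request_bytes=AVG_BULK_REQUEST)

print(f"queue of 200:  {small / 2**30:.2f} GiB")
print(f"queue of 2000: {large / 2**30:.2f} GiB")
```

The estimate scales linearly with queue size, so a tenfold larger queue can pin roughly ten times as much heap during a burst.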

Thanks for the reply, that is good to know. I'll keep an eye on the JVM memory usage to make sure it isn't causing any problems.

It's best not to make the thread pools larger than the number of logical CPUs that Elasticsearch has available; otherwise the threads just compete for compute resources. In future versions of Elasticsearch you will not be able to increase these thread pools past a certain size (the minimum of 32 and the number of logical CPUs), so you shouldn't rely on this to solve problems.
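The bound described above (the minimum of 32 and the number of logical CPUs) can be written out as a small calculation. This is only a sketch of the rule as stated in this thread, not Elasticsearch's actual code. `capped_pool_size` and its parameters are hypothetical names.

```python
import os
from typing import Optional


def capped_pool_size(requested: int,
                     logical_cpus: Optional[int] = None,
                     hard_cap: int = 32) -> int:
    """Clamp a requested pool size to min(hard_cap, logical CPUs)."""
    # Fall back to the CPU count of the machine running this script.
    cpus = logical_cpus if logical_cpus is not None else (os.cpu_count() or 1)
    return min(requested, min(hard_cap, cpus))


print(capped_pool_size(64, logical_cpus=8))   # limited by the CPU count
print(capped_pool_size(64, logical_cpus=48))  # limited by the hard cap
print(capped_pool_size(4, logical_cpus=8))    # already within bounds
```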

It's possible, but not yet certain, that in future versions of Elasticsearch the maximum queue sizes are going to be restricted too. Overly large full queues are a danger to a healthy server.

That is good information! We are not changing the pool sizes at all, only the queue sizes, but that is a good reason to leave the pools alone.

It is this in particular that I am interested in. What exactly is the danger posed by having large queues? Is it strictly an issue of memory being used by many large queued requests, or is it more complicated than that?

The dangers are manifold. First, as you said, there is the memory allocation. A queued request also means the sender may consider the job done when it is not. Second, latency is everything when it comes to performance. Jobs hanging in long queues increase the latency of the system from the client's perspective whenever synchronous responses are required. ES is designed to be asynchronous for that reason, to achieve low latency, but this does not come for free. Increased memory usage brings more garbage collection and unnecessary interaction with the OS at the I/O layer.

The key word is "back pressure": imagine lots of slow clients that sit and wait for responses to their requests. Scan/scroll is one example, where large amounts of data are involved and ES may produce data faster than it can be consumed. ES must keep all the references to such slow clients open, and this can escalate until the system tips over. Slow clients can kill very fast servers.
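The trade-off between rejecting early and queueing deeply can be shown with a toy bounded queue. This is a sketch only. `BoundedIndexQueue` and `worst_case_wait_seconds` are hypothetical names, and the capacity, burst size, and drain rate are made-up numbers rather than Elasticsearch behaviour or defaults.

```python
import queue


class BoundedIndexQueue:
    """Toy stand-in for a fixed-size request queue (not Elasticsearch code)."""

    def __init__(self, capacity: int) -> None:
        self._q = queue.Queue(maxsize=capacity)
        self.rejected = 0

    def submit(self, request) -> bool:
        """Accept the request if there is room; otherwise reject immediately."""
        try:
            self._q.put_nowait(request)
            return True
        except queue.Full:
            # Rejection is the back-pressure signal: the client must slow down.
            self.rejected += 1
            return False

    def depth(self) -> int:
        return self._q.qsize()


def worst_case_wait_seconds(queue_depth: int, drained_per_second: float) -> float:
    """Time the last queued request waits before a worker picks it up."""
    return queue_depth / drained_per_second


# Hypothetical burst: 500 requests hit a queue of 200 that drains at 50/s.
q = BoundedIndexQueue(capacity=200)
accepted = sum(q.submit(i) for i in range(500))

print(accepted, q.rejected)                    # 200 300
print(worst_case_wait_seconds(q.depth(), 50))  # 4.0
```

With a capacity of 2000 instead, nothing in this burst would be rejected, but the last request would wait about 10 seconds at the same drain rate, and all 500 request bodies would sit in memory in the meantime.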

One strategy is to bring the server and client into a dynamic balance. For example, this could be done with reactive streams: bidirectional streams that adjust themselves according to the current capacity of the client or the server.
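The core idea behind that balance is that the producer only emits as much as the consumer has asked for. The sketch below is a deliberately simplified, single-process illustration of demand signalling. It is not the Reactive Streams API, and `DemandDrivenSource` is a hypothetical class.

```python
from collections import deque


class DemandDrivenSource:
    """Hands out items only in response to explicit demand from the consumer."""

    def __init__(self, items) -> None:
        self._pending = deque(items)

    def request(self, n: int) -> list:
        """Return at most n items; the consumer picks n to match its capacity."""
        batch = []
        while self._pending and len(batch) < n:
            batch.append(self._pending.popleft())
        return batch

    def remaining(self) -> int:
        return len(self._pending)


source = DemandDrivenSource(range(10))

# A slow consumer asks for small batches; a faster one can ask for more.
first = source.request(3)
second = source.request(5)

print(first)               # [0, 1, 2]
print(second)              # [3, 4, 5, 6, 7]
print(source.remaining())  # 2
```

Because nothing is emitted without demand, a slow consumer never forces the producer to buffer an unbounded backlog on its behalf.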
