Increasing Threadpool Queue Sizes

kbutler · January 20, 2016, 8:48pm

I was wondering if anybody had any experience with increasing bulk and index queue sizes. I am working on an application that encounters large bursts of index requests that have so far been managed by increasing the queue sizes, as we can handle slower response times.

We have not seen any major issues with resource saturation or garbage collection, but I wanted to know what kinds of problems we could potentially run into. Does anybody know what problems or side effects can be caused by increasing bulk/index queue sizes?

Thanks,
Keith

warkolm · January 20, 2016, 9:12pm

Thread pool queues, when in use, are held in memory. So increasing their size, and actually filling them, puts pressure on the heap, with all the impact that can have.
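To put rough numbers on that heap pressure, here is a back-of-the-envelope sketch. The queue sizes and the 5 MB average request size are made-up figures for illustration, not measurements or Elasticsearch defaults, and `worst_case_queue_bytes` is a hypothetical helper. The point is only that a full queue holds every pending request in memory at once, so the worst case grows linearly with the queue size.

```python
MB = 1024 * 1024


def worst_case_queue_bytes(queue_size, avg_request_bytes, pools=1):
    """Upper bound on memory held by queued requests when every queue is full.

    All inputs are assumptions supplied by the caller; nothing is measured.
    """
    return queue_size * avg_request_bytes * pools


def describe(queue_size, avg_request_mb):
    """Format the worst case for one queue as a human-readable line."""
    total = worst_case_queue_bytes(queue_size, avg_request_mb * MB)
    return f"queue_size={queue_size}: up to {total / MB:.0f} MB held by queued requests"


if __name__ == "__main__":
    # Hypothetical small and enlarged queue sizes, 5 MB per bulk request.
    print(describe(50, 5))
    print(describe(5000, 5))
```

Comparing the two lines of output against the heap you actually give the JVM is a quick sanity check before raising a queue size.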

kbutler · January 22, 2016, 8:59pm

Thanks for the reply, that is good to know. I'll keep an eye on the JVM memory usage to make sure it isn't causing any problems.

jasontedor · January 23, 2016, 3:37pm

It's best to not make the thread pools larger than the number of logical CPUs that Elasticsearch has available, otherwise the threads are just competing for compute resources. In future versions of Elasticsearch you will not be able to increase these thread pools past a certain size (the minimum of 32 and the number of logical CPUs) so you shouldn't rely on this to solve problems.
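As a small illustration of the limit described above (the minimum of 32 and the number of logical CPUs), the arithmetic can be sketched like this. `capped_pool_size` is a hypothetical helper, not an Elasticsearch function, and how Elasticsearch will actually enforce the bound is not shown here.

```python
import os


def capped_pool_size(requested, logical_cpus, hard_limit=32):
    """Clamp a requested thread pool size to min(hard_limit, logical_cpus).

    hard_limit=32 follows the bound mentioned in the post above; the
    lower bound of 1 is an assumption so the result is always usable.
    """
    upper = min(hard_limit, logical_cpus)
    return max(1, min(requested, upper))


if __name__ == "__main__":
    cpus = os.cpu_count() or 1  # cpu_count() may return None
    print(f"logical CPUs: {cpus}, requested 64 -> {capped_pool_size(64, cpus)}")
```

On an 8-core machine a request for 64 threads would come out as 8, and on a 48-core machine it would come out as 32.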

It's possible, but not yet certain, that in future versions of Elasticsearch the maximum queue sizes are going to be restricted too. Overly large full queues are a danger to a healthy server.

kbutler · January 25, 2016, 4:12pm

That is good information! We are not changing the pool sizes at all, only the queue sizes, but that is a good reason to leave the pools alone.

It is the point about overly large full queues in particular that I am interested in. What exactly is the danger posed by having large queues? Is it strictly an issue of memory being used by many large queued requests, or is it more complicated than that?

jprante · January 25, 2016, 10:17pm

The dangers are manifold. First, as you said, there is the memory allocation: every queued request occupies heap until it is processed. Queuing also means the sender of a queued request considers the job done while it is not.

Second, latency is everything when it comes to performance. Jobs hanging in long queues increase the latency of the system from the client's perspective when synchronous responses are required. ES is designed asynchronously for that reason, to achieve low latency, but this does not come for free. Increased memory usage brings more garbage collections and unnecessary interaction with the OS at the I/O layer.

The key word is "back pressure": imagine lots of slow clients that sit there and wait for responses to their requests. Scan/scroll is one example, where large amounts of data are involved and ES may produce data at a higher rate than it can be consumed. ES must keep open all the references to such slow clients, and this can escalate until the system tips over. Slow clients can kill very fast servers.
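The trade-off between a small and a large queue can be sketched with Python's standard `queue.Queue`. This is a toy model, not Elasticsearch's implementation: `submit_burst` and `drain_time_seconds` are hypothetical helpers, and the worker count and per-request time are assumed values. A small bounded queue rejects most of a burst immediately, which is the back pressure signal, while a large one accepts everything and turns the burst into memory use and waiting time.

```python
import queue


def submit_burst(q, n):
    """Try to enqueue n requests without blocking; count accepted and rejected."""
    accepted = rejected = 0
    for request_id in range(n):
        try:
            q.put_nowait(request_id)
            accepted += 1
        except queue.Full:
            # A full bounded queue pushes back on the sender instead of buffering.
            rejected += 1
    return accepted, rejected


def drain_time_seconds(queued, workers, secs_per_request):
    """Rough time until the last queued request starts to be answered.

    Assumes identical requests and perfectly parallel workers.
    """
    return queued * secs_per_request / workers


if __name__ == "__main__":
    for size in (50, 1000):
        q = queue.Queue(maxsize=size)
        accepted, rejected = submit_burst(q, 1000)
        wait = drain_time_seconds(accepted, workers=4, secs_per_request=0.2)
        print(f"maxsize={size}: accepted={accepted} rejected={rejected} "
              f"last request waits about {wait:.1f}s")
```

With the larger queue nothing is rejected, but the last request in the burst waits far longer, and every accepted request stays in memory until a worker picks it up.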

One strategy is to bring the server and client into a dynamic balance. For example, this could be done with reactive streams: bidirectional streams that adjust themselves according to the current capacity of the client or the server.
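A full reactive streams implementation is beyond a forum post, but the idea of a client adjusting itself to the server's capacity can be sketched with a much simpler stand-in: an additive-increase / multiplicative-decrease (AIMD) limit on in-flight requests. This is not reactive streams and not part of any Elasticsearch client; `AdaptiveWindow` and its default values are hypothetical. The client would call `on_rejected()` whenever the server rejects a request because its queue is full, and `on_success()` otherwise.

```python
class AdaptiveWindow:
    """AIMD limit on how many requests a client keeps in flight at once."""

    def __init__(self, initial=8, minimum=1, maximum=64):
        self.minimum = minimum
        self.maximum = maximum
        # Keep the starting limit inside [minimum, maximum].
        self.limit = max(minimum, min(initial, maximum))

    def on_success(self):
        """Server kept up: probe for more capacity, one request at a time."""
        self.limit = min(self.maximum, self.limit + 1)

    def on_rejected(self):
        """Server pushed back: halve the load, but never drop below minimum."""
        self.limit = max(self.minimum, self.limit // 2)


if __name__ == "__main__":
    window = AdaptiveWindow()
    # Simulated outcomes: True means success, False means a rejection.
    for ok in [True, True, False, True, False, False, True]:
        if ok:
            window.on_success()
        else:
            window.on_rejected()
        print("success " if ok else "rejected", "-> limit", window.limit)
```

The design choice is that rejections cut the load quickly while recovery is gradual, so a burst backs off on the client side instead of piling up in a server-side queue.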
