Should I increase my threadpool size if I get rejected executions or HTTP 429 responses?

Threadpools are described in the documentation as:

A node uses several thread pools to manage memory consumption. Queues associated with many of the thread pools enable pending requests to be held instead of discarded.

These threadpools, and the queues attached to them, act as short-term, per-node buffers for pending work. Their sizes are dynamically calculated from the number of processors (CPU threads/cores) detected when you start an Elasticsearch node on a host.

When the queue for a specific threadpool fills up, the node starts to reject requests for that pool. It logs a rejected execution error in its own logs and returns an HTTP 429 error code to the client.
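To see which pool on which node is doing the rejecting, the cat thread pool API reports a running `rejected` count per pool. Below is a minimal Python sketch, assuming an unsecured cluster reachable at `http://localhost:9200` (an assumption, adjust for your setup). The helper names and the sample rows are made up for illustration.

```python
import json
import urllib.request


def fetch_thread_pool_rows(base_url="http://localhost:9200"):
    """Fetch per-node thread pool stats as a list of dicts.

    With format=json the cat APIs return every value as a string.
    The base_url default is an assumption, not something from this post.
    """
    url = base_url + "/_cat/thread_pool?format=json&h=node_name,name,active,queue,rejected"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def pools_with_rejections(rows):
    """Return (node, pool, rejected) for each pool that has rejected work."""
    return [
        (row["node_name"], row["name"], int(row["rejected"]))
        for row in rows
        if int(row["rejected"]) > 0
    ]


# Made-up sample rows in the shape the API returns, so this runs without a cluster.
sample_rows = [
    {"node_name": "node-1", "name": "search", "active": "2", "queue": "0", "rejected": "0"},
    {"node_name": "node-1", "name": "write", "active": "8", "queue": "200", "rejected": "37"},
]

print(pools_with_rejections(sample_rows))
```

Against a live cluster you would call `pools_with_rejections(fetch_thread_pool_rows())` instead. The count is cumulative since the node started, so compare two readings over time to see whether rejections are still happening.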

In this fantastic post, @jasontedor runs through why increasing or otherwise changing your threadpool settings may not be the best idea, even if it looks like it solves a problem.

Why is your queue filling up? Because your producers are adding work items to the queue faster than the consumers can take them off the queue.

What would happen if you increase the queue size? The producers will still be producing at the rate they are currently producing, and the consumers will still be consuming at the rate they are currently consuming. This means that the queue will still fill up. Except now, the queue is bigger which will put more memory pressure on the system. In fact, this means the consumers might slow down, which might exacerbate the production rate greater than consumption rate problem. From there, a vicious feedback loop will form.
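The argument above can be checked with a toy model. This is only a sketch: time moves in fixed ticks, the rates are invented numbers, and it does not model the memory-pressure slowdown, so real behaviour under a larger queue would be somewhat worse than shown.

```python
def simulate(queue_capacity, produce_per_tick, consume_per_tick, ticks):
    """Toy discrete-time model of one bounded queue.

    Returns (first_rejection_tick, total_rejected, final_depth).
    """
    depth = 0
    rejected = 0
    first_rejection_tick = None
    for tick in range(ticks):
        # Consumers take work off the queue first...
        depth = max(0, depth - consume_per_tick)
        # ...then producers add as much as still fits; the rest is rejected.
        accepted = min(produce_per_tick, queue_capacity - depth)
        overflow = produce_per_tick - accepted
        depth += accepted
        if overflow and first_rejection_tick is None:
            first_rejection_tick = tick
        rejected += overflow
    return first_rejection_tick, rejected, depth


# Producers add 12 items per tick, consumers only manage 10 (invented rates).
small = simulate(queue_capacity=200, produce_per_tick=12, consume_per_tick=10, ticks=10_000)
large = simulate(queue_capacity=2_000, produce_per_tick=12, consume_per_tick=10, ticks=10_000)
print(small)  # (95, 19810, 200)
print(large)  # (995, 18010, 2000)
```

Making the queue ten times larger only postpones the first rejection from tick 95 to tick 995. After that, both queues reject the same 2 items per tick, and the larger one is holding ten times as many pending items in memory.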

Queues are helpful for handling variable load. They are not helpful for handling scenarios where the production rate exceeds the consumption rate. In this scenario, a queue filling up is a form of back pressure. It is telling you that you need to slow down the rate of production (or speed up the rate of consumption).
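On the client side, one common way to respect that back pressure is to wait and retry when a request comes back with a 429, waiting longer after each rejection. The sketch below is one possible approach, not something prescribed by the post: `send` is a hypothetical callable that performs your request and returns the HTTP status code, and the delay values are arbitrary defaults.

```python
import random
import time


def send_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Call send() until it returns something other than 429.

    `send` is a hypothetical zero-argument callable returning an HTTP status code.
    Waits use exponential backoff with random jitter so that many clients
    do not all retry at the same moment.
    """
    for attempt in range(max_attempts):
        status = send()
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            # The upper bound doubles after every rejection, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
    # Still rejected after max_attempts; let the caller decide what to do.
    return 429


# Usage with a fake sender that is rejected twice and then succeeds.
responses = iter([429, 429, 200])
waits = []
result = send_with_backoff(lambda: next(responses), sleep=waits.append)
print(result, len(waits))  # 200 2
```

Some client libraries and ingest tools already retry rejected requests for you, so check what yours does before layering this on top.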

Only increase the queue size if you are in a situation where you have variable load that sometimes exceeds capacity. Do not increase the queue size if you are in a situation where production rate exceeds consumption rate.
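To make the distinction between the two situations concrete, here is a small sketch that feeds a bursty arrival pattern and a sustained overload through the same toy bounded queue. All numbers are invented, and the model ignores memory cost entirely.

```python
def rejected_for(arrivals, queue_capacity, consume_per_tick):
    """Count rejections for a list of per-tick arrival counts on a bounded queue."""
    depth = 0
    rejected = 0
    for arriving in arrivals:
        # Consumers drain the queue, then new work is accepted while it fits.
        depth = max(0, depth - consume_per_tick)
        accepted = min(arriving, queue_capacity - depth)
        rejected += arriving - accepted
        depth += accepted
    return rejected


# Variable load: a burst of 50 every 10th tick, otherwise 2.
# The average is 6.8 per tick, below the consumption rate of 10.
bursty = [50 if tick % 10 == 0 else 2 for tick in range(1000)]
# Sustained overload: 12 per tick, always above the consumption rate of 10.
sustained = [12] * 1000

for name, arrivals in (("bursty", bursty), ("sustained", sustained)):
    small = rejected_for(arrivals, queue_capacity=20, consume_per_tick=10)
    large = rejected_for(arrivals, queue_capacity=50, consume_per_tick=10)
    print(name, small, large)
# bursty 3000 0
# sustained 1990 1960
```

With bursty load the larger queue absorbs each spike and the rejections disappear. With sustained overload the larger queue barely changes the outcome, which is the case where the queue size should be left alone.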

Note - This post is a few years old and threadpools have had improvements over time, including the dynamic sizing mentioned above and in the docs. However, we still give the same advice if you encounter this error/response.