Search requests still get rejected at the same number of queued requests despite having increased `queue_size`

We have a five-node cluster. For heavy processing, our system sends search requests to Elasticsearch 7.17 in bursts. These bursts fill the search thread pool's queue up to `queue_size`, at which point Elasticsearch starts rejecting requests.

However, although multiple instances of the service that sends these bursts are running, each instance has to wait for its current burst to finish before sending a new one. Moreover, the individual requests in a burst finish rather quickly. Therefore, the number of outstanding requests is bounded.

Hence, the solution appeared simple: increase `queue_size` and let Elasticsearch process the queued requests when it is ready to do so. We did this and verified it using `GET /_cat/thread_pool/search?v=true&h=id,node_name,name,active,queue,rejected,completed,host,largest,pool_size,queue_size,type`, but requests still get rejected at exactly the same point, i.e. when there are 1000 requests in the queue. Why is this happening, and what can I do about it?
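To make the symptom easier to spot per node, the `_cat/thread_pool` output can be parsed with a few lines of Python. This is only a sketch: the sample table is made up for illustration (node names and counts are hypothetical), and `parse_cat_table` and `queue_headroom` are helper names introduced here, not part of any Elasticsearch client.

```python
def parse_cat_table(text):
    """Parse whitespace-separated _cat output (with a ?v=true header row) into dicts."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def queue_headroom(rows):
    """Map each node name to reported queue_size minus current queue length."""
    return {r["node_name"]: int(r["queue_size"]) - int(r["queue"]) for r in rows}


# Hypothetical sample; real output would come from
# GET /_cat/thread_pool/search?v=true&h=node_name,name,active,queue,rejected,queue_size
SAMPLE = """
node_name name   active queue rejected queue_size
node-1    search 13     1000  4821     2000
node-2    search 13     12    0        2000
"""

rows = parse_cat_table(SAMPLE)
for node, headroom in queue_headroom(rows).items():
    print(node, headroom)
```

If `rejected` keeps climbing on a node while its headroom never drops below a fixed value, the effective limit is presumably lower than the reported `queue_size`, which matches what we are seeing.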

I'm not quite sure why, but increasing `min_queue_size` so that it once more equals `queue_size` solved the problem.
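For anyone wanting to guard against the same mismatch, here is a rough sanity check over a flat settings mapping. It assumes the settings are named `thread_pool.search.queue_size` and `thread_pool.search.min_queue_size` (my reading of the 7.x thread pool settings; check the docs for your version), and the value 2000 is just an example. `check_search_queue` is a hypothetical helper, not an Elasticsearch API.

```python
def check_search_queue(settings):
    """Report whether min_queue_size matches queue_size for the search pool.

    `settings` is a flat dict of setting name -> value; the key names below
    are assumptions based on the 7.x thread pool settings.
    """
    queue = settings.get("thread_pool.search.queue_size")
    minimum = settings.get("thread_pool.search.min_queue_size")
    if queue is None:
        return "queue_size is not set explicitly"
    if minimum is None:
        return "min_queue_size is not set explicitly"
    if int(minimum) != int(queue):
        return f"min_queue_size ({minimum}) differs from queue_size ({queue})"
    return "ok"


# Example values are hypothetical.
before = {"thread_pool.search.queue_size": 2000}
after = {
    "thread_pool.search.queue_size": 2000,
    "thread_pool.search.min_queue_size": 2000,
}
print(check_search_queue(before))
print(check_search_queue(after))
```

The check only compares the two values; it does not explain why they need to match.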

