We are seeing some EsRejectedExecutionException errors of the form "rejected execution of ... on EsThreadPoolExecutor[search, queue capacity = 1000, ... [Running, pool size = 4, active threads = 4, queued tasks = 1000, completed tasks = 128865868]]]".
The reason for the exception is clear: we have only 4 search threads, and all of them were busy long enough for 1000 items to accumulate in the queue, so Elasticsearch rejects further requests instead of letting the queue grow endlessly.
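For reference, the counters in that message can be pulled out mechanically to confirm the pool really was saturated. This is only a small sketch: the helper name parse_pool_stats is made up for illustration, and the sample string is the message from above with its "..." elisions left as they are.

```python
import re

# Sample text copied from the rejection message above; the "..." elisions are kept as-is.
MESSAGE = (
    "rejected execution of ... on EsThreadPoolExecutor[search, "
    "queue capacity = 1000, ... [Running, pool size = 4, active threads = 4, "
    "queued tasks = 1000, completed tasks = 128865868]]]"
)


def parse_pool_stats(message):
    """Collect every 'lowercase words = number' pair in the message into a dict."""
    pairs = re.findall(r"([a-z]+(?: [a-z]+)*) = (\d+)", message)
    return {key: int(value) for key, value in pairs}


stats = parse_pool_stats(MESSAGE)

# Saturated means: every thread is busy and the queue is at capacity.
saturated = (
    stats["active threads"] == stats["pool size"]
    and stats["queued tasks"] >= stats["queue capacity"]
)
print(stats)
print("saturated:", saturated)
```

All this shows is that all 4 threads were active and the queue was full at the moment of rejection; it says nothing about what each queued item is.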
What I would like to find out is what a "work item" in the threads and the queue actually represents. Is every item one full query, which can access many different indices and shards? Or is each item in the queue an access to a single shard?
This matters because in the first case a single query that takes forever cannot block all available threads, whereas in the second case it can, since one query typically fans out into a number of shard-level queries.
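To make the difference between the two cases concrete, here is a toy model of a fixed-size pool with a bounded queue. It is not Elasticsearch code and does not claim to show which granularity Elasticsearch uses; the class BoundedPool, the helper occupy, and the figure of 8 shards per slow query are all assumptions chosen for illustration.

```python
from collections import deque


class BoundedPool:
    """Toy model: a fixed number of threads plus a bounded waiting queue."""

    def __init__(self, threads, queue_capacity):
        self.threads = threads
        self.queue_capacity = queue_capacity
        self.active = 0          # items currently holding a thread
        self.queue = deque()     # items waiting for a free thread
        self.rejected = 0        # items refused because the queue was full

    def submit(self, item):
        """Run the item if a thread is free, else queue it, else reject it."""
        if self.active < self.threads:
            self.active += 1
            return True
        if len(self.queue) < self.queue_capacity:
            self.queue.append(item)
            return True
        self.rejected += 1
        return False


def occupy(threads, queue_capacity, items_per_slow_query):
    """Submit one never-finishing query that expands into the given number of items."""
    pool = BoundedPool(threads, queue_capacity)
    for i in range(items_per_slow_query):
        pool.submit(("slow", i))
    return pool


# Case 1: one work item per query. Case 2: one work item per shard (8 shards assumed).
per_query = occupy(threads=4, queue_capacity=1000, items_per_slow_query=1)
per_shard = occupy(threads=4, queue_capacity=1000, items_per_slow_query=8)

for label, pool in (("per query", per_query), ("per shard", per_shard)):
    free = pool.threads - pool.active
    print(label, "-> free threads:", free, "queued:", len(pool.queue))
```

In this model the single slow query leaves 3 threads free in the first case, but takes all 4 threads and already has 4 items waiting in the second, so every later query has to queue behind it.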
I read the documentation at https://www.elastic.co/guide/en/elasticsearch/reference/6.2/modules-threadpool.html and the blog post at https://qbox.io/blog/thread-pools-elasticsearch-search-request-errors, but neither seems to answer this.
I would guess it is the second case, i.e. each accessed shard uses up one item, but a more definite answer would be very welcome.
Does anybody know, or can someone point me to where in the code I could find out?