Number of active threads for bulk thread_pool is equal to number of shards to which write is happening and not a single bulk request

vineeth_mohan_2 · July 12, 2018, 10:06am

The number of active threads in the bulk thread_pool equals the number of shards being written to, not the number of bulk requests.

My understanding was that active threads in the thread_pool = number of concurrent bulk requests.
But it seems to equal the number of shards that the writes go to.
We write to multiple indices in a single bulk request, and it seems to exceed active threads + queue size and finally hits rejections, even though we have only 1 bulk request running at a time.

Is my observation wrong, or is this how Elasticsearch works?

vineeth_mohan_2 · July 12, 2018, 10:19am

It seems that is how Elasticsearch works, and it is well documented: https://www.elastic.co/blog/why-am-i-seeing-bulk-rejections-in-my-elasticsearch-cluster

But what should be done when a single bulk request writes to multiple indices? The number of shards involved can be very large.

Christian_Dahlqvist · July 12, 2018, 10:20am

What is the use case? How many indices and shards do you have? What is your sharding strategy, given that you end up writing to so many shards in a single bulk request?

If you have a very large number of shards in your cluster, you may also benefit from this blog post.

vineeth_mohan_2 · July 12, 2018, 10:37am

Well, let's assume I have 50 indices with 5 shards each. Inside my bulk request, each item can go to a different index, which means that number of active threads = concurrency × (number of shards written to), which could be 1 × 250 in the worst case, where I am writing to all shards.
Some of the index requests will fail for sure.
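
To make the arithmetic above concrete, here is a minimal sketch. The function name and parameters are made up for illustration, and it assumes the worst case described in this post, where every bulk request touches every primary shard of every index.

```python
def worst_case_shard_tasks(num_indices: int, shards_per_index: int,
                           concurrent_bulks: int = 1) -> int:
    """Upper bound on shard-level write tasks created by the bulk requests.

    Assumption: each bulk request is split into one sub-request per shard
    it touches, and in the worst case it touches every primary shard.
    """
    return concurrent_bulks * num_indices * shards_per_index


# 50 indices with 5 shards each, one bulk request at a time
print(worst_case_shard_tasks(num_indices=50, shards_per_index=5))  # 250
```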

Christian_Dahlqvist · July 12, 2018, 10:39am

If all shards are on a single node, that could end up exceeding the queue size. If you can reduce the number of shards per index, the situation would improve. Do you need 5 shards per index?
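
A rough way to see the effect of fewer shards per index is sketched below. This is a deliberately simplified model, not how Elasticsearch schedules work internally: it assumes all shard-level tasks land on one node at the same instant and that none finish while the others wait. The thread count (8) and queue size (200) are assumed example values; check the actual settings of your cluster.

```python
def rejected_tasks(shard_tasks_on_node: int, threads: int, queue_size: int) -> int:
    """Tasks that fit neither in a thread nor in the queue (simplified model).

    Assumption: all tasks arrive at once and nothing completes in between,
    so the node can hold at most threads + queue_size tasks.
    """
    capacity = threads + queue_size
    return max(0, shard_tasks_on_node - capacity)


THREADS = 8        # assumed example value
QUEUE_SIZE = 200   # assumed example value

# 50 indices x 5 shards, all on one node, hit by a single bulk request
print(rejected_tasks(50 * 5, THREADS, QUEUE_SIZE))  # 42

# Same 50 indices with 1 shard each
print(rejected_tasks(50 * 1, THREADS, QUEUE_SIZE))  # 0
```

Under these assumptions, going from 5 shards to 1 shard per index brings the worst case well below the node's capacity. Spreading the shards over more nodes has a similar effect, since each node then receives only part of the tasks.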

