Bulk inserts not balanced

apecoraro · February 6, 2017, 6:40pm

I have a 5-node cluster and an index with 5 primary shards and 1 replica. I have a multi-process system for inserting documents into my index. Each inserter instance randomly selects a node from the ES cluster to send its bulk insert requests to, in order to balance the load across all nodes. Despite that, one ES node seems to get stuck with the majority of the work: when I look at _cat/thread_pool/bulk, one node has all of its bulk threads active with a large backlog of requests in the queue, while the other four nodes have only 2 or 3 active bulk threads and no backlog.
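
To make the thread pool check above easier to repeat, here is a minimal Python sketch that parses the text returned by `GET _cat/thread_pool/bulk?v&h=node_name,active,queue,rejected` and lists the nodes with a queued backlog. The function names, the threshold, the node names, and the sample numbers are all hypothetical and only illustrate the pattern described, not real output from this cluster.

```python
# Sketch: find nodes whose bulk queue is backed up, given the text output of
#   GET _cat/thread_pool/bulk?v&h=node_name,active,queue,rejected
# The sample below is made up for illustration.

def parse_bulk_pool(cat_output):
    """Turn the whitespace-separated table into a list of dicts."""
    lines = [ln.split() for ln in cat_output.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    stats = []
    for row in rows:
        rec = dict(zip(header, row))
        # Convert the numeric columns from strings to integers.
        for key in ("active", "queue", "rejected"):
            rec[key] = int(rec[key])
        stats.append(rec)
    return stats


def backlogged_nodes(stats, queue_threshold=1):
    """Return names of nodes with at least queue_threshold queued requests."""
    return [s["node_name"] for s in stats if s["queue"] >= queue_threshold]


# Hypothetical output: one busy node, four mostly idle ones.
SAMPLE = """
node_name active queue rejected
node-1    8      42    0
node-2    2      0     0
node-3    3      0     0
node-4    2      0     0
node-5    3      0     0
"""

hot = backlogged_nodes(parse_bulk_pool(SAMPLE))
print(hot)
```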

Why would this happen?

spinscale · February 7, 2017, 8:16am

Hey,

Are your shards and documents evenly distributed across the cluster? Does each node hold two shards, and do those shards contain the same number of documents?
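
One way to answer both questions is to tally the output of `GET _cat/shards/<index>?v&h=index,shard,prirep,state,docs,node` per node. The sketch below does that in Python. The index name, node names, and document counts are hypothetical sample data, and the helper name is made up for this example.

```python
from collections import defaultdict

# Sketch: count shards, primaries, and documents per node from the text of
#   GET _cat/shards/<index>?v&h=index,shard,prirep,state,docs,node
# All values in SAMPLE are invented for illustration.

def shard_summary(cat_output):
    """Return {node: {"shards": n, "primaries": n, "docs": n}}."""
    lines = [ln.split() for ln in cat_output.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    summary = defaultdict(lambda: {"shards": 0, "primaries": 0, "docs": 0})
    for row in rows:
        rec = dict(zip(header, row))
        # Unassigned shards have no node or doc count, so skip them.
        if rec.get("state") != "STARTED":
            continue
        node = summary[rec["node"]]
        node["shards"] += 1
        node["docs"] += int(rec["docs"])
        if rec["prirep"] == "p":
            node["primaries"] += 1
    return dict(summary)


SAMPLE = """
index   shard prirep state   docs node
myindex 0     p      STARTED 1000 node-1
myindex 0     r      STARTED 1000 node-2
myindex 1     p      STARTED 1010 node-2
myindex 1     r      STARTED 1010 node-3
myindex 2     p      STARTED 990  node-3
myindex 2     r      STARTED 990  node-4
myindex 3     p      STARTED 1005 node-4
myindex 3     r      STARTED 1005 node-5
myindex 4     p      STARTED 995  node-5
myindex 4     r      STARTED 995  node-1
"""

for name, counts in sorted(shard_summary(SAMPLE).items()):
    print(name, counts)
```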

--Alex

apecoraro · February 7, 2017, 5:13pm

Yes, the shards are balanced. After creating the index, my application uses the reroute API to make sure that each node gets one primary shard. I have a five-node cluster, and my indexes are created with 5 primary shards.
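
The post does not show how the application calls the reroute API, so the following is only a sketch of one way it could be done: building the body of a `POST _cluster/reroute` request from `move` commands. The helper name, index name, node names, and the current/desired layouts are assumptions for illustration.

```python
import json

# Sketch: build a _cluster/reroute body that moves shards to a desired layout.
# "current" and "desired" map shard number -> node name (hypothetical values).
# Elasticsearch can still reject a move, for example when the target node
# already holds another copy of the same shard.

def build_move_commands(index, current, desired):
    """Return a reroute body with one move command per misplaced shard."""
    commands = []
    for shard in sorted(desired):
        src, dst = current[shard], desired[shard]
        if src != dst:
            commands.append({
                "move": {
                    "index": index,
                    "shard": shard,
                    "from_node": src,
                    "to_node": dst,
                }
            })
    return {"commands": commands}


# Hypothetical layout: shard 1 landed on node-1 but should live on node-2.
current = {0: "node-1", 1: "node-1", 2: "node-3"}
desired = {0: "node-1", 1: "node-2", 2: "node-3"}

body = build_move_commands("myindex", current, desired)
print(json.dumps(body, indent=2))
```

The resulting dictionary would be sent as the JSON body of the reroute request. Checking shard placement afterwards confirms whether the cluster accepted the moves.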

system · March 7, 2017, 5:13pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
