During maintenance, all writes to new shards targeted one node, leading to document rejections

ssouris · March 29, 2021, 5:43pm

Hi Elasticsearch experts

During a maintenance window on our cluster, we needed to restart a node to add more disk space to it.
We made sure no shards were located on that node and then performed the maintenance. When we brought it back up, Elasticsearch started relocating shards to it, which made sense.

Then we observed an issue: at 00:00 (midnight) the indices rolled over, and Elasticsearch decided to place the new shards on that node. Immediately, all write traffic targeted that node, leading to high CPU usage, write thread pools filling up, and documents being rejected.

Is there a good way in Elasticsearch to prevent the cluster from allocating all the "hot" shards to a single node?
I understand this could happen regardless of maintenance, so I'd like to know how the community handles it.

Thanks in advance.

whatgeorgemade · March 30, 2021, 9:26am

That does seem like a mis-allocation. Do the indices that rolled over have a single shard?

ssouris · March 30, 2021, 9:31am

Thanks for taking the time.
They have the same number of shards as the existing indices of that daily index, which is 5.
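
For anyone hitting the same thing: one commonly used option is the per-index setting `index.routing.allocation.total_shards_per_node`, which caps how many shards of one index a single node may hold. Below is a minimal Python sketch of how such a cap might be computed for a 5-shard daily index and placed in the index settings (for example in an index template). The replica count, the number of data nodes, the `headroom` parameter and the helper names are assumptions for illustration, not details from this thread.

```python
import math


def shards_per_node_cap(primaries: int, replicas: int, data_nodes: int,
                        headroom: int = 0) -> int:
    """Smallest per-node cap that still lets every shard copy be allocated.

    primaries  -- number of primary shards in the index (5 in this thread)
    replicas   -- replicas per primary (assumed; not stated in the thread)
    data_nodes -- nodes eligible to hold the index (assumed)
    headroom   -- hypothetical extra slack so shards can still be placed
                  when a node is down for maintenance
    """
    if data_nodes <= 0:
        raise ValueError("data_nodes must be positive")
    total_copies = primaries * (1 + replicas)
    return math.ceil(total_copies / data_nodes) + headroom


def index_settings(primaries: int, replicas: int, data_nodes: int,
                   headroom: int = 0) -> dict:
    """Build an index settings body that includes the per-node shard cap."""
    cap = shards_per_node_cap(primaries, replicas, data_nodes, headroom)
    return {
        "index.number_of_shards": primaries,
        "index.number_of_replicas": replicas,
        "index.routing.allocation.total_shards_per_node": cap,
    }


if __name__ == "__main__":
    # Assumed example: 5 primaries, 1 replica each, 5 data nodes.
    print(index_settings(primaries=5, replicas=1, data_nodes=5))
```

The cap is a hard limit, so a value that is too tight can leave shards unassigned while a node is down; the `headroom` argument is one way to leave some slack for that case.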

system · April 27, 2021, 9:32am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.