Hi Elasticsearch experts,
During a recent maintenance window on our cluster, we needed to restart a node to add more disk space to it.
We made sure no shards were located on that node before performing the maintenance. When we brought it back up, Elasticsearch started relocating shards to it, which made sense.
Then we ran into an issue: at 00:00 (midnight) our indices rolled over, and Elasticsearch placed the new shards on that node, presumably because it still held far fewer shards than the others. All write traffic immediately targeted that one node, leading to high CPU usage, write thread pools filling up, and documents being rejected.
Is there a good way to prevent Elasticsearch from allocating all the "hot" shards (the shards of the indices currently being written to) on a single node?
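
To make the question concrete, here is a minimal sketch of the one approach I can think of: capping how many shards of a single index may be placed on one node with the index-level setting `index.routing.allocation.total_shards_per_node`. The helper names and the example numbers (6 primaries, 1 replica, 4 data nodes) are hypothetical and not taken from our cluster. The sketch only builds the settings body that would go into an index template; it does not call Elasticsearch.

```python
import json
import math


def shards_per_node_cap(primaries: int, replicas: int, data_nodes: int) -> int:
    """Return the smallest per-node cap that still lets every shard copy be allocated.

    Hypothetical helper: total shard copies divided by data nodes, rounded up.
    """
    if primaries < 1 or replicas < 0 or data_nodes < 1:
        raise ValueError("primaries and data_nodes must be >= 1, replicas >= 0")
    total_copies = primaries * (1 + replicas)
    return math.ceil(total_copies / data_nodes)


def template_settings(primaries: int, replicas: int, data_nodes: int) -> dict:
    """Build the index settings for a template so each rolled-over index is spread out."""
    return {
        "index.number_of_shards": primaries,
        "index.number_of_replicas": replicas,
        # Limits how many shards (primaries and replicas) of this one index
        # a single node may hold.
        "index.routing.allocation.total_shards_per_node": shards_per_node_cap(
            primaries, replicas, data_nodes
        ),
    }


if __name__ == "__main__":
    # Assumed example values, not our real cluster layout.
    print(json.dumps(template_settings(primaries=6, replicas=1, data_nodes=4), indent=2))
```

As far as I understand, the catch is that this is a hard limit: with the tightest possible cap, shards may stay unassigned when a node is down, so some headroom above the computed value is probably needed.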
I understand this could happen regardless of maintenance, so I would like to know how the community deals with it.
Thanks in advance.