Restarting many nodes

Hi team, my first post here :)
I work as a sysadmin, which means I do a bit of everything.

The scenario:
ES 5.6.0
6 data nodes
Adding 3 master nodes and 2 client nodes (coordinating nodes).

Shard sizes are up to 130 GB and the cluster holds about 800 GB in total.
SSD disks in RAID 0, 1 Gbit network.

I changed elasticsearch.yml on the data nodes.

Restarted node 1: the cluster went yellow, with initializing and unassigned shards.
After about 1 hour the cluster was green again.

I found a suggestion on the forum to do the following:
cluster.routing.allocation.enable "none"
restart node 2
cluster.routing.allocation.enable "all"
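
The steps above boil down to two calls to the cluster settings API, one before and one after the restart. A minimal Python sketch, assuming the cluster answers on http://localhost:9200 (a placeholder address) and that a transient setting is acceptable; the helper names are made up for illustration:

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: replace with your cluster address


def allocation_body(mode):
    """Build the JSON body that sets cluster.routing.allocation.enable."""
    if mode not in ("all", "primaries", "new_primaries", "none"):
        raise ValueError("unknown allocation mode: %r" % mode)
    return {"transient": {"cluster.routing.allocation.enable": mode}}


def allocation_request(mode, base_url=ES_URL):
    """Prepare (but do not send) the PUT _cluster/settings request."""
    return urllib.request.Request(
        base_url + "/_cluster/settings",
        data=json.dumps(allocation_body(mode)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def set_allocation(mode, base_url=ES_URL):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(allocation_request(mode, base_url)) as resp:
        return json.load(resp)
```

Intended use: call `set_allocation("none")`, restart the node, wait until it has rejoined the cluster, then call `set_allocation("all")`.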

That did not help; shards are still initializing.

I still have 4 more data nodes to go, and after that one more cluster. At this rate it will take me about 12 hours!

Any advice? Thanks!

Hi, might be a silly question, but did you wait for the restarted node to rejoin the cluster and be available before you set cluster.routing.allocation.enable back to "all"?

I have had no major problems doing rolling restarts on my cluster with 20 nodes and quite a few TB of data. Largest shards have been 200GB.

My cluster is "rack aware", so I set cluster.routing.allocation.enable to "none", restart ES on all nodes in one "rack", wait until they all show up again in the list of nodes, and then set cluster.routing.allocation.enable to "all". It usually takes less than a minute for the cluster to go from yellow back to green.
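
The "wait until they all show up again" step can be scripted by polling the cluster health API for its `number_of_nodes` field. A rough sketch, assuming http://localhost:9200 as a placeholder address; `fetch_health` and `wait_for_nodes` are hypothetical helper names, and the timeout and interval values are arbitrary:

```python
import json
import time
import urllib.request


def fetch_health(base_url="http://localhost:9200"):
    """Return the parsed response of GET _cluster/health."""
    with urllib.request.urlopen(base_url + "/_cluster/health") as resp:
        return json.load(resp)


def wait_for_nodes(expected, fetch=fetch_health, timeout=600, interval=5,
                   sleep=time.sleep):
    """Poll until the cluster reports at least `expected` nodes.

    Returns True once the node count is reached, False on timeout.
    """
    waited = 0
    while True:
        health = fetch()
        if health.get("number_of_nodes", 0) >= expected:
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

For a 20-node cluster this would be called as `wait_for_nodes(20)` after the restart and before allocation is switched back to "all".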

Not sure whether introducing new nodes triggers some sort of shard rebalancing.

Thanks for the advice, A_B.
I restarted node3 10 min ago.
After node3 joined the cluster I waited an extra 3 min.

It is actually the INITIALIZING phase that takes the time.

The replica shards on node3 are all started.
The shards that were primaries on node3 are now INITIALIZING as replicas on node3 (two at a time). During this process, inbound network traffic on node3 is at its maximum, coming from the nodes that now hold the primaries.
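
A back-of-the-envelope estimate shows why a saturated link translates into long recovery times. The sketch below assumes the whole shard is copied over the wire and ignores protocol overhead and any recovery throttling, so it gives a lower bound only; `transfer_minutes` is a made-up helper name:

```python
def transfer_minutes(size_gb, link_gbit_per_s):
    """Lower bound on the minutes needed to copy size_gb over the link."""
    link_gb_per_s = link_gbit_per_s / 8.0  # 8 bits per byte
    return size_gb / link_gb_per_s / 60.0


# One 130 GB shard over a 1 Gbit link
print(round(transfer_minutes(130, 1.0), 1))  # 17.3
```

So a single 130 GB shard needs at least about 17 minutes on a 1 Gbit network, and several such shards per node add up quickly.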
