Rolling restart of an Elasticsearch cluster

I have an index with 7 primary shards and 2 replicas. Each shard copy is on its own machine, so there are 21 data nodes.

When I have to do a rolling restart, I currently check whether shard relocation is complete before I switch over to restart another node.

http://$MARVEL_HOST/_cluster/health?wait_for_status=green&wait_for_relocating_shards=0&timeout=15m
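In curl form, the same blocking check looks roughly like this (a sketch; the quotes just keep the shell from interpreting the & characters, and $MARVEL_HOST is assumed to point at any node's HTTP port):

curl -s -S "http://$MARVEL_HOST/_cluster/health?wait_for_status=green&wait_for_relocating_shards=0&timeout=15m"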

Do I need to wait for shard allocation to complete or should I simply check if cluster status is green and then move over to the next node?

In addition, do I also need to worry about initializing shards before restarting another node?

Say I only wait for the cluster to become green and then restart a node that holds a primary shard whose replica is still in the process of relocating. Will I run into any issues? How does Elasticsearch handle such a scenario?

I am using version 1.7.0.

When I have to do a rolling restart, I currently check whether shard relocation is complete before I switch over to restart another node.

Why are shards being reallocated? The recommendation is to disable allocations while a node is being restarted and enable it once it's up again, so it should return to green very quickly without any additional reallocations.

Do I need to wait for shard allocation to complete or should I simply check if cluster status is green and then move over to the next node?

Since you have two replicas of each shard you don't even have to wait for the cluster to become green, since you can handle two unavailable nodes at a time. But sure, taking them down one at a time, and only when the cluster is green, means you could suffer an unplanned node loss during the restart without jeopardizing data availability.
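If you did decide to move on earlier, the same health endpoint can wait for yellow instead of green (illustrative only; yellow means all primaries are assigned, while some replicas may still be unassigned):

curl -s -S "http://$MARVEL_HOST/_cluster/health?wait_for_status=yellow&timeout=15m"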

Why are shards being reallocated? The recommendation is to disable allocations while a node is being restarted and enable it once it's up again, so it should return to green very quickly without any additional reallocations.

Right... before I shut down a node, I do disable shard allocation by executing the command:

curl -s -S -XPUT http://$MARVEL_HOST/_cluster/settings -d '{"transient":{"cluster.routing.allocation.enable":"none"}}'

and once the node has restarted, allocation is enabled again by executing the command:

curl -s -S -XPUT http://$MARVEL_HOST/_cluster/settings -d '{"transient":{"cluster.routing.allocation.enable":"all"}}'
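For reference, I can also read the current value back from the same endpoint to confirm the setting took effect (a quick sketch using GET):

curl -s -S -XGET http://$MARVEL_HOST/_cluster/settings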

Once allocation is enabled, I see that shards start getting relocated. I have seen that the cluster goes green but shards are still being relocated to different machines. And that's where my question is: do I need to wait for the shard relocation to complete before I move on to the next node, or not?

I am following rolling restart guidelines as specified here: Rolling Restarts | Elasticsearch: The Definitive Guide [2.x] | Elastic.
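Putting the pieces together, a rough sketch of one per-node pass could look like this (the hostnames, the SSH access, the service restart command, and the node count are all placeholders, not values from the guide):

#!/bin/bash
# Rolling-restart sketch for ES 1.7; adjust NODES, TOTAL_NODES and the restart command.
NODES="es-data-01 es-data-02 es-data-03"   # hypothetical hostnames
TOTAL_NODES=21                             # total nodes the cluster should report
for NODE in $NODES; do
  # 1. Disable shard allocation so the cluster does not start rebuilding replicas.
  curl -s -S -XPUT http://$MARVEL_HOST/_cluster/settings -d '{"transient":{"cluster.routing.allocation.enable":"none"}}'
  # 2. Restart Elasticsearch on the node (mechanism depends on the deployment).
  ssh "$NODE" "sudo service elasticsearch restart"
  # 3. Wait until the node has rejoined the cluster.
  curl -s -S "http://$MARVEL_HOST/_cluster/health?wait_for_nodes=$TOTAL_NODES&timeout=15m"
  # 4. Re-enable allocation.
  curl -s -S -XPUT http://$MARVEL_HOST/_cluster/settings -d '{"transient":{"cluster.routing.allocation.enable":"all"}}'
  # 5. Block until the cluster is green and nothing is relocating before the next node.
  curl -s -S "http://$MARVEL_HOST/_cluster/health?wait_for_status=green&wait_for_relocating_shards=0&timeout=15m"
done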

Once allocation is enabled, I see that shards start getting relocated. I have seen that the cluster goes green but shards are still being relocated to different machines.

So the reallocation starts after the cluster goes green when you've restarted a single node? That's not what I would've expected—that it thinks there's a need to reallocate, that is. Unless things have happened while the node was down the previous shard equilibrium should remain.

And that's where my question is: do I need to wait for the shard relocation to complete before I move on to the next node, or not?

You don't have to wait.

So the reallocation starts after the cluster goes green when you've restarted a single node? That's not what I would've expected—that it thinks there's a need to reallocate, that is. Unless things have happened while the node was down the previous shard equilibrium should remain.

I think I understand now. Yes, when the node was down, indexing was still going on. Thus, when the node restarts and allocation is re-enabled, the shards get relocated.

I have made the appropriate changes to confirm that the cluster is green before restarting the next node.

Thank you for your help, Magnus.