Shard reallocation after rolling restart

alsyia · May 31, 2017, 1:22pm

Hi there,

I just performed a rolling restart of my cluster (I needed to update a plugin), following the docs. I disabled shard allocation, restarted one node, waited until it rejoined the cluster, re-enabled allocation, waited for the cluster state to be green, and repeated the process on each node.
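
The loop described above can be sketched roughly as follows. This is only a minimal sketch, not the commands the poster actually ran: it assumes a cluster reachable at the hypothetical address `http://localhost:9200`, toggles the transient `cluster.routing.allocation.enable` setting, and waits on `_cluster/health`. `restart_node` is a placeholder callback for however a node gets restarted in your environment.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: hypothetical cluster address


def allocation_request(mode):
    """Build (method, path, body) that sets shard allocation to "none" or "all"."""
    body = {"transient": {"cluster.routing.allocation.enable": mode}}
    return "PUT", "/_cluster/settings", body


def wait_for_green_request(timeout="60s"):
    """Build a health request that blocks until green or until the timeout expires."""
    return "GET", "/_cluster/health?wait_for_status=green&timeout=" + timeout, None


def send(method, path, body=None, base_url=ES_URL):
    """Send one JSON request to the cluster and return the decoded response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def rolling_restart(nodes, restart_node, transport=send):
    """Restart nodes one at a time, disabling allocation around each restart."""
    for node in nodes:
        transport(*allocation_request("none"))
        # Placeholder: must return only once the node has rejoined the cluster.
        restart_node(node)
        transport(*allocation_request("all"))
        health = transport(*wait_for_green_request())
        if health.get("timed_out"):
            raise RuntimeError("cluster not green after restarting " + str(node))
```

Passing the transport as a parameter keeps the sequence of requests easy to inspect or dry-run without touching a real cluster.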

However, I was a bit surprised to see some shards being moved from one node to another after re-enabling allocation.

I thought the shards would simply be reassigned to the node that restarted, since the data was already on its disk. But some of them were moved...

What is the reason for that? Can we do something to prevent it?

Thank you.

abeyad · June 2, 2017, 2:16pm

Hello,

Were you still indexing documents during the rolling restart? If so, here is what likely happened (assuming you have replicas in addition to primary shards). When you stopped a node, its replica shards went offline, and for any primary shards it held, a replica elsewhere in the cluster was promoted to primary. If indexing continued while the node was down, Lucene may have merged segments on the surviving copies, so their segment files can end up completely different from the shard data on the node that left the cluster to be upgraded. When that node comes back, none of its files for the shard match (even though both copies contain many of the same documents), so Elasticsearch sees no reason to favor the rejoined node over any other node in the cluster when allocating that shard.
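
To make the file comparison concrete, here is a toy model of the idea. It is not Elasticsearch's allocator code: the function names, file names, sizes, and checksums are all made up for illustration. It only shows that once a merge has rewritten the segments on the active copy, the restarted node has no reusable files and so scores no better than an empty node.

```python
def reusable_bytes(primary_files, candidate_files):
    """Sum the sizes of primary files the candidate already holds unchanged.

    Both arguments map a file name to a (size_in_bytes, checksum) tuple.
    """
    return sum(
        size
        for name, (size, checksum) in primary_files.items()
        if candidate_files.get(name) == (size, checksum)
    )


def best_candidate(primary_files, candidates):
    """Return (preferred node or None, per-node scores) for a shard copy."""
    scores = {
        node: reusable_bytes(primary_files, files)
        for node, files in candidates.items()
    }
    best = max(scores, key=scores.get)
    # A score of zero means no node has anything to reuse: no preference.
    return (best if scores[best] > 0 else None), scores


# Hypothetical data: the restarted node still holds the pre-merge segments.
primary_before = {"_0.cfs": (1200, "a1"), "_1.cfs": (800, "b2")}
restarted_node = dict(primary_before)
# While the node was down, indexing triggered a merge into a new segment.
primary_after_merge = {"_2.cfs": (1900, "c3")}

candidates = {"restarted-node": restarted_node, "other-node": {}}
preferred, scores = best_candidate(primary_after_merge, candidates)
```

In this made-up scenario `preferred` is `None`, whereas comparing against `primary_before` would have favored the restarted node.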

Here are some guidelines for making the upgrade process as smooth as possible: https://www.elastic.co/guide/en/elasticsearch/reference/current/rolling-upgrades.html
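
One of the optional steps in that guide, at least in its 5.x version, is to stop non-essential indexing and issue a synced flush before stopping a node. A minimal sketch of that step follows, assuming a hypothetical cluster address and the `POST /_flush/synced` endpoint as it existed in 5.x (it was deprecated and removed in later major versions, so check the docs for your version). The helper names are made up for illustration.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: hypothetical cluster address


def synced_flush(base_url=ES_URL):
    """Request a synced flush on all indices and return the decoded response.

    HTTP errors propagate to the caller as urllib.error.HTTPError.
    """
    req = urllib.request.Request(
        base_url + "/_flush/synced", data=b"", method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def failed_shards(response):
    """Return how many shard copies the synced flush reported as failed."""
    return response.get("_shards", {}).get("failed", 0)
```

A synced flush is best effort: if `failed_shards(...)` is non-zero (for example because indexing is still in progress), it is reasonable to retry before stopping the node.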

Lastly, the problem I described above will go away once sequence numbers are introduced (expected for 6.0), which will allow recovery to be based on the index operations a shard copy is missing rather than on comparing files.
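
As a rough illustration of the difference, here is a toy model of operation-based recovery. It is not how Elasticsearch implements it: the `seq_no` field, the checkpoint value, and the operation list are hypothetical. The point is only that a returning copy needs the operations it missed, however the files on disk have changed.

```python
def ops_to_replay(primary_ops, replica_checkpoint):
    """Return the operations whose sequence number is above the replica's checkpoint."""
    return [op for op in primary_ops if op["seq_no"] > replica_checkpoint]


# Hypothetical history: the node went down after applying operation 2.
primary_ops = [
    {"seq_no": 0, "action": "index", "id": "a"},
    {"seq_no": 1, "action": "index", "id": "b"},
    {"seq_no": 2, "action": "delete", "id": "a"},
    {"seq_no": 3, "action": "index", "id": "c"},
    {"seq_no": 4, "action": "index", "id": "d"},
]
missing = ops_to_replay(primary_ops, replica_checkpoint=2)
```

Here `missing` holds only the two operations indexed while the node was away, so the rejoined node stays the cheapest place to recover the shard.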

alsyia · June 2, 2017, 2:32pm

Hi,
I am not sure if the cluster was in use or not, so it's totally possible that what you describe happened. Makes sense.

Thank you for your nice answer. I'm looking forward to ES 6, then!

system · June 30, 2017, 2:32pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
