After a cluster restart, both primary and replica shards become unallocated after a long time

whzcl · February 25, 2016, 1:18pm

Version: 1.5.2
Data size: 60GB
Shard number: 30 primary, 30 replica
The following settings are left at their default values:
gateway.recover_after_nodes
gateway.recover_after_time
gateway.expected_nodes

Description: We have 12 nodes: 3 master nodes and 9 data nodes. To uninstall some plugins, we had to restart the whole cluster. We took the following steps:

Uninstall the plugin and shut down the nodes one by one (we didn't disable shard reallocation).
As a result, all the nodes were shut down.
Start the master nodes one by one.
Start the data nodes one by one.
However, after about an hour, some primary and replica shards were still unallocated.
The error logs are as follows:

[Screenshot 2016-02-25 9.16.31 PM.png]

So we had to reroute the unallocated shards using the following API call. However, after rerouting the shards, the data in them was lost.

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{ "commands" : [ { "allocate" : { "index" : "t37", "shard" : $shard, "node" : "datanode15", "allow_primary" : true } } ] }'
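
Note that `$shard` sits inside single quotes in the command above, so the shell will not expand it. Below is a minimal sketch of one way to substitute the shard number, assuming a POSIX shell; `reroute_body` is a hypothetical helper name, not part of Elasticsearch, and the shard number 3 is only an example value.

```shell
# Hypothetical helper: print the reroute request body for one shard.
# Arguments: index name, shard number, target node name.
reroute_body() {
  printf '{ "commands" : [ { "allocate" : { "index" : "%s", "shard" : %s, "node" : "%s", "allow_primary" : true } } ] }' "$1" "$2" "$3"
}

# Usage (not executed here; needs a running cluster):
#   curl -XPOST 'localhost:9200/_cluster/reroute' -d "$(reroute_body t37 3 datanode15)"
```

Building the body separately makes it possible to inspect the JSON before sending it.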

Therefore, my questions are as follows:

Why are primary shards still unallocated after such a long time?
How do we restart the cluster correctly and safely?
If some primary shards do end up unallocated, how can we reroute them without losing data?

Thank you.

warkolm · February 26, 2016, 3:08am

That shard count is too high. Why so many?

You should ideally follow this procedure - Rolling Restarts | Elasticsearch: The Definitive Guide [master] | Elastic
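
The central step of that procedure is to disable shard allocation before stopping a node and to re-enable it once the node has rejoined. Here is a rough sketch, assuming the cluster is reachable at localhost:9200 as in the reroute command earlier in this thread; `allocation_body` and `set_allocation` are hypothetical helper names, and `ES_HOST` is an assumed variable.

```shell
ES_HOST=${ES_HOST:-localhost:9200}   # assumption: same address as used earlier in the thread

# Hypothetical helper: print a cluster settings body.
# Arguments: scope ("transient" or "persistent"), value ("none" or "all").
allocation_body() {
  printf '{ "%s" : { "cluster.routing.allocation.enable" : "%s" } }' "$1" "$2"
}

# Hypothetical helper: send the setting to the cluster (needs a running cluster).
set_allocation() {
  curl -XPUT "$ES_HOST/_cluster/settings" -d "$(allocation_body "$1" "$2")"
}

# Usage (not executed here):
#   set_allocation transient none   # before stopping a node
#   ...restart the node and wait for it to rejoin...
#   set_allocation transient all    # after it has rejoined
```

Transient settings are not kept across a full cluster restart, so if every node is going to be down at the same time, the `persistent` scope is probably the one to use.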

whzcl · February 26, 2016, 5:33am

Yes, I know this procedure. However, we didn't restart the nodes one by one (a rolling restart). Instead, we shut down every node one by one until the whole cluster was down, and then started every node one by one. In this case, the rolling restart procedure does not apply: https://www.elastic.co/guide/en/elasticsearch/guide/master/_rolling_restarts.html

The reason we allocated so many shards is that we may expand to more data nodes in the future.
