Unassigned Shards After Node Restart

We have a 3-node ES cluster (2.3.3) with 4 cores and 16 GB RAM. The config on each node includes:
gateway.expected_nodes: "3"
gateway.recover_after_nodes: "2"
5 shards per index

Last week while doing maintenance I did the following (this is a development cluster):

  1. Set cluster settings: "cluster.routing.allocation.enable": "none"
  2. Rebooted 2 nodes
  3. Set cluster settings: "cluster.routing.allocation.enable": "all"
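For reference, the steps above were run roughly like this (hedged sketch: assuming curl against one of the nodes on localhost:9200, which is a placeholder for whichever node you hit; I used the transient scope, which does not survive a full-cluster restart):

```shell
# Step 1: disable shard allocation before the maintenance window.
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": { "cluster.routing.allocation.enable": "none" }
}'

# Step 2: reboot the nodes...

# Step 3: re-enable allocation once the nodes are back.
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": { "cluster.routing.allocation.enable": "all" }
}'
```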

What I expected was that the remaining node would pause while the two nodes were down, and that once the nodes had restarted the cluster would come back up and recover.

What in fact happened is that the cluster came back up (showing 3 nodes) and for a while reported a positive count of initializing shards. Eventually the initializing count dropped to zero, but only 66 percent of shards were active; _cat/shards showed many shards as "unassigned", and the cluster state was red.
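In case it helps with diagnosis, this is roughly how I checked (host is a placeholder; the unassigned.reason column is part of the 2.x _cat API):

```shell
# Cluster-level view: counts of active / initializing / unassigned shards.
curl 'http://localhost:9200/_cluster/health?pretty'

# Per-shard view, including why each unassigned shard is unassigned
# (e.g. NODE_LEFT, CLUSTER_RECOVERED, ALLOCATION_FAILED).
curl 'http://localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason&v'
```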

I suspect what I did may not have been a good idea; I should have taken one node down at a time. But I don't understand why the rebooted nodes would not have been able to restart the shards they owned. It's a cause for concern because, for logistical reasons, we have to run two of our nodes in a single data center.

Can anyone shed some light on what could have happened?


You'd need to dig into your logs.

But, did you have minimum masters set?

Yes, I should have mentioned that minimum_master_nodes was set to 2.
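For completeness, that's the zen discovery setting in elasticsearch.yml on each node (2.x syntax):

```yaml
# With 3 master-eligible nodes, the quorum is (3 / 2) + 1 = 2.
discovery.zen.minimum_master_nodes: 2
```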

The logs show nothing that I would regard as unusual in the circumstances...

On the remaining node:
  - "not enough master nodes", then eventually the other two nodes join again

On the rebooted nodes:
  - Usual startup logs
  - A couple of exceptions because they can't find some Groovy scripts (I don't imagine that would matter)
  - Detection of the remaining node as master
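If the cluster is still red, one thing worth trying (a hedged suggestion, not a fix: this only nudges the allocator, it won't force-assign anything) is an empty reroute, which asks the master to re-run allocation; _cat/recovery then shows whether anything is actually moving. Host is a placeholder:

```shell
# Ask the master to re-run shard allocation (no commands = just reroute).
curl -XPOST 'http://localhost:9200/_cluster/reroute?pretty'

# Watch per-shard recoveries to see whether anything is progressing.
curl 'http://localhost:9200/_cat/recovery?v'
```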