Rolling restarts issue

Michael1 · July 21, 2016, 4:33pm

Hello,

I made a change in elasticsearch.yml (disabling Shield for testing). First, I executed the following PUT on one of the master nodes directly:

{
    "transient" : {
        "cluster.routing.allocation.enable" : "none"
    }
}
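For reference, a body like the one above is normally sent to the cluster settings endpoint (`_cluster/settings`). The post does not show the exact request, so the following is only a minimal Python sketch of how it might be built. The base URL `http://localhost:9200` and the helper name `build_allocation_request` are assumptions for illustration, and any Shield authentication is left out.

```python
import json
import urllib.request


def build_allocation_request(base_url: str, mode: str) -> urllib.request.Request:
    """Build (but do not send) a PUT that sets cluster.routing.allocation.enable.

    Hypothetical helper: only the two values used in this thread are accepted.
    """
    if mode not in {"all", "none"}:
        raise ValueError("mode must be 'all' or 'none'")
    # Same transient setting as in the post, with the mode filled in.
    body = {"transient": {"cluster.routing.allocation.enable": mode}}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    # Assumed address; replace with a node that is actually reachable.
    req = build_allocation_request("http://localhost:9200", "none")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req, timeout=10)
```

Building the request separately from sending it makes it easy to check the URL and body before anything touches the cluster.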

I then rebooted the node, but now I am stuck and do not know whether the master node has rejoined the cluster. There are 3 master nodes in total.

I tried changing 'none' to 'all' and running it against that node, but now every PUT request returns 'master_not_discovered_exception'. The cluster seems to be down at this point.

Not sure what to do now...

jpcarey · July 21, 2016, 4:41pm

On one of the master nodes that was not restarted, curl _cat/master or _cat/nodes. I would also take a look at the log file on the node that you restarted, and see why it was not able to join the cluster.
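As a rough sketch of that check in Python: the cat APIs return plain-text tables, and adding `?v` includes a header row. The address in `fetch_cat`, the helper names, and the sample output below are all made up for illustration; real output will have different ids, addresses, and possibly columns.

```python
import urllib.request
from typing import Dict, List


def fetch_cat(base_url: str, endpoint: str) -> str:
    """GET a cat endpoint (for example "master" or "nodes") with a header row."""
    url = base_url.rstrip("/") + "/_cat/" + endpoint + "?v"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")


def parse_cat_table(text: str) -> List[Dict[str, str]]:
    """Turn whitespace-separated cat output (with header) into row dicts.

    Assumes no column value contains spaces.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


if __name__ == "__main__":
    # Hypothetical sample of `_cat/master?v` output, not taken from this cluster.
    sample = (
        "id     host     ip       node\n"
        "abc123 10.0.0.2 10.0.0.2 master-2\n"
    )
    for row in parse_cat_table(sample):
        print("elected master:", row["node"])
    # Against a live node that was NOT restarted (assumed address):
    # print(fetch_cat("http://localhost:9200", "nodes"))
```

If the restarted node's name is missing from the `_cat/nodes` rows, it has not rejoined yet.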

You should not enable allocation on a rolling upgrade until the node has rejoined the cluster; otherwise, it will cause the shards to be redistributed evenly amongst the current nodes in the cluster.
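The ordering matters: wait for the restarted node to show up again, and only then set allocation back to 'all'. A minimal Python sketch of the waiting step follows, assuming `_cat/nodes?h=name` is polled on a node that stayed up. The function names, the address, and the retry counts are hypothetical choices, not values from the documentation.

```python
import time
import urllib.request
from typing import Callable


def fetch_node_names(base_url: str) -> str:
    """Return the node names known to the cluster, one per line."""
    url = base_url.rstrip("/") + "/_cat/nodes?h=name"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")


def node_has_rejoined(names_text: str, node_name: str) -> bool:
    """True if node_name appears as a full line in the cat output."""
    return node_name in {ln.strip() for ln in names_text.splitlines()}


def wait_until_rejoined(
    fetch_names: Callable[[], str],
    node_name: str,
    attempts: int = 30,
    delay_s: float = 2.0,
) -> bool:
    """Poll until node_name is listed, or give up after `attempts` tries."""
    for attempt in range(attempts):
        try:
            if node_has_rejoined(fetch_names(), node_name):
                return True
        except OSError:
            # Connection refused, timeouts, and HTTP errors all land here.
            pass
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False


if __name__ == "__main__":
    # Assumed address and node name; point this at a node that was not restarted.
    ok = wait_until_rejoined(
        lambda: fetch_node_names("http://localhost:9200"), "master-1", attempts=3
    )
    print("safe to re-enable allocation" if ok else "node has not rejoined yet")
```

Only when this returns True would the allocation setting be changed back to 'all'.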

https://www.elastic.co/guide/en/elasticsearch/reference/current/cat.html

Michael1 · July 21, 2016, 4:48pm

Well, I said whatever and just rebooted all of the master and data nodes at the same time, and now it is back. Thankfully, this was not a production environment, but if it were, this would be a problem.

I followed the instructions for rolling restarts as it said, and this is a very basic cluster, so it is relatively unacceptable to me that nothing is mentioned about looking out for things like this.

Michael1 · July 21, 2016, 4:52pm

Also, I already tried looking at the logs (in /var/log/elasticsearch). The only log file updated today contains just one error, about getting the cluster health due to Shield expiration. I assume this would not affect a node being able to join a cluster, correct?

I see nothing else besides that in the log file.

jpcarey · July 21, 2016, 4:59pm

No, the behavior of each plugin's license expiration is documented: https://www.elastic.co/guide/en/shield/current/license-management.html#license-management

It is hard to say what exactly went wrong. You might find this article helpful in understanding your cluster health. However, with the Shield license expired, the cluster health, cluster stats, and index stats APIs are blocked. If you have a support subscription, you should reach out directly to get your license situation resolved and to further troubleshoot what went wrong in your cluster.
