Master node discovery is not working

LogBabel · January 31, 2018, 7:28pm

Hello all,

Here's another problem with our cluster that I cannot understand. When a node (node1) drops out of the cluster, the other nodes keep trying to use node1 as master. For some reason, master re-election does not take place. When node1 is restarted the cluster still does not heal; the other nodes keep logging errors such as "failed to send health to master node node1, node not connected".
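For anyone debugging the same symptom, one way to confirm that the nodes disagree about the master is to ask each node directly and group the answers. The following is only a rough sketch: the host names are placeholders, it assumes HTTP on port 9200 without authentication, and it relies on the `_cat/master` endpoint with `format=json` and `local=true` so that each node answers from its own cluster state.

```python
import json
from collections import defaultdict
from urllib.request import urlopen


def fetch_master(host, port=9200, timeout=5.0):
    """Return the master node name as seen by `host`, or None on any failure.

    Uses GET _cat/master?format=json&local=true so the node answers from its
    own view of the cluster state instead of forwarding the request.
    """
    url = f"http://{host}:{port}/_cat/master?format=json&local=true"
    try:
        with urlopen(url, timeout=timeout) as resp:
            rows = json.load(resp)
    except (OSError, ValueError):
        # Connection errors, HTTP errors (e.g. no master known) or bad JSON.
        return None
    return rows[0].get("node") if rows else None


def group_by_master(views):
    """Group node names by the master each one reports.

    `views` maps node name -> reported master name (or None if unknown).
    """
    groups = defaultdict(list)
    for node, master in views.items():
        groups[master].append(node)
    return dict(groups)


# Hypothetical usage; replace the host names with your own nodes:
# hosts = ["node1", "node2", "node3", "node6"]
# print(group_by_master({h: fetch_master(h) for h in hosts}))
```

If the result has more than one key, or a `None` key, some nodes either see a different master or see none at all, which matches the "node not connected" errors described above.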

This started when one of our data nodes (node6) ran out of memory and the ES process was killed. (The OOM was likely caused by too many recoveries running while indexing.) After node6 was restarted, it failed to discover the master. Additionally, all queries to the rest of the cluster failed. I decided to restart the active master node (node1) to trigger rediscovery, but this did not work as expected. Now it seems that, once again, a full cluster restart is the only way to recover.

I'm not sure what is wrong. The config is fine and has worked for ages. iptables allows all cluster traffic. There are 8 nodes: 6 are data nodes and 2 are ingest-only. 3 of the data nodes are master-eligible. Minimum master nodes (discovery.zen.minimum_master_nodes) is set to 2. Zen discovery is using unicast.
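As a sanity check on those numbers, the usual Zen discovery rule is that minimum master nodes should be a majority of the master-eligible nodes, i.e. (master-eligible nodes / 2) + 1 with integer division. A small sketch of that arithmetic (the function names are made up for illustration):

```python
def quorum(master_eligible: int) -> int:
    """Majority of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def can_elect(master_eligible_alive: int, minimum_master_nodes: int) -> bool:
    """True if enough master-eligible nodes are reachable to elect a master."""
    return master_eligible_alive >= minimum_master_nodes


MASTER_ELIGIBLE = 3        # from the setup described above
MINIMUM_MASTER_NODES = 2   # from the setup described above

print(quorum(MASTER_ELIGIBLE))                               # 2
print(can_elect(MASTER_ELIGIBLE - 1, MINIMUM_MASTER_NODES))  # True
```

So with 3 master-eligible nodes a setting of 2 is the expected value, and losing node1 alone should still leave enough master-eligible nodes for an election, provided the remaining two can reach each other.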

This is Elasticsearch 5.6.3.

LogBabel · January 31, 2018, 8:59pm

I couldn't figure this out and assumed it might have been a bug triggered by an unsafe shutdown (the OOM kill).

It seemed a good time to upgrade to 6.1.3 so I've done that. This thread is probably no longer relevant due to the upgrade.

system · February 28, 2018, 8:59pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
