Elasticsearch master pods failing: "master not discovered or elected yet, an election requires at least 2 nodes"

Hello, I'm facing an issue with my Elasticsearch cluster. I have 15 pods running across multiple nodes on my Kubernetes cluster: 3 elasticsearch-master pods and 12 elasticsearch-data pods.
We are facing an issue with the master pods: only 1 master pod is running, it is not discovering the remaining 2 master pods, and the cluster is in an out-of-sync state.
Please find the logs below:

[2024-08-27T05:52:13,612][WARN ][o.e.c.c.ClusterFormationFailureHelper] [twamp-es-master-1] master not discovered yet, this node has not previously joined a bootstrapped cluster, and [cluster.initial_master_nodes] is empty on this node: have discovered [{twamp-es-master-1}{iDt5Fo70ShKuNgEaopAtjQ}{TYFK2HM1TWW0i-0cQhCPIQ}{twamp-es-master-1}{192.168.14.47}{192.168.14.47:9300}{mr}{8.9.1}, {twamp-es-master-0}{k4_DXeaGQY6cjpOyMcqkgg}{ZBOZCjtMTTGKlr8SVXfeiQ}{twamp-es-master-0}{192.168.7.158}{192.168.7.158:9300}{mr}{8.9.1}, {twamp-es-master-2}{kBAN4EF7TzaUQmSsODP8aA}{jV7DVxERQLqba5TsMvsRfQ}{twamp-es-master-2}{192.168.18.23}{192.168.18.23:9300}{mr}{8.9.1}]; discovery will continue using [192.168.18.23:9300, 192.168.7.158:9300] from hosts providers and [{twamp-es-master-1}{iDt5Fo70ShKuNgEaopAtjQ}{TYFK2HM1TWW0i-0cQhCPIQ}{twamp-es-master-1}{192.168.14.47}{192.168.14.47:9300}{mr}{8.9.1}] from last-known cluster state; node term 0, last-accepted version 0 in term 0; for troubleshooting guidance, see https://www.elastic.co/guide/en/elasticsearch/reference/8.9/discovery-troubleshooting.html
at org.elasticsearch.server@8.9.1/org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$2.onTimeout(TransportMasterNodeAction.java:316)
        at org.elasticsearch.server@8.9.1/org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:355)
        at org.elasticsearch.server@8.9.1/org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:293)
        at org.elasticsearch.server@8.9.1/org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:642)
        at org.elasticsearch.server@8.9.1/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:916)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1623) 

Do all the master nodes have persistent storage? How are they configured?

Yes, all the master nodes have persistent storage. We actually wanted to expand the nodes' capacity, so we deleted the PVCs and the pods restarted with the updated, expanded capacity. But after that the master pods are again giving us the error:
"master not discovered yet, this node has not previously joined a bootstrapped cluster, and [cluster.initial_master_nodes] is empty on this node:"

It looks to me like the cluster state that was saved on disk, which contained the information about the other nodes, has been lost, and you have apparently removed the cluster.initial_master_nodes setting from the config (as recommended). I suspect you need to set the cluster up again as a new cluster, which will cause data loss unless you have a snapshot available.
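If you do go that route, "setting it up again as a new cluster" roughly means starting the masters once with cluster.initial_master_nodes listing all three of them, and removing the setting again after the cluster has formed. A minimal sketch, assuming the official image's default config path and using the master node names from your logs:

```
# Sketch only: for the very first start of a brand-new cluster, list all
# master-eligible node names; remove this setting again once the cluster
# has formed. Path assumes the official Elasticsearch image defaults.
cat >> /usr/share/elasticsearch/config/elasticsearch.yml <<'EOF'
cluster.initial_master_nodes:
  - twamp-es-master-0
  - twamp-es-master-1
  - twamp-es-master-2
EOF
```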

Is there any solution to recover the data if we still have the shards available in the PV/PVC? We don't have snapshots available.

I believe the elasticsearch-node utility was designed for this type of unsafe recovery, but I have no idea how you would be able to use it with a cluster on Kubernetes. As I believe you have lost all the master nodes, it is likely the data is lost and cannot be recovered.
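For reference, on a plain installation the unsafe recovery with that tool looks roughly like the sketch below. It only helps if at least one master-eligible node still has its on-disk cluster state, which does not sound like your situation, and I don't know how you would stop the Elasticsearch process inside a pod while keeping its volume mounted to run it:

```
# Sketch only: both commands must be run while the Elasticsearch process on
# that node is stopped, from the Elasticsearch home directory.

# On one surviving master-eligible node: promote its on-disk state into a
# new cluster.
bin/elasticsearch-node unsafe-bootstrap

# On every other node (e.g. the data nodes): detach from the old cluster so
# they can join the newly bootstrapped one.
bin/elasticsearch-node detach-cluster
```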

I do not know much about Kubernetes, but isn't the PVC the persistent storage? If you deleted it, you deleted the data for the node.
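For what it's worth, a PVC can usually be grown in place instead of being deleted, provided the StorageClass has allowVolumeExpansion: true. A sketch, with the PVC name and target size as placeholders:

```
# Sketch only: expand an existing PVC in place; requires a StorageClass with
# allowVolumeExpansion: true. Name and size below are placeholders.
kubectl patch pvc data-twamp-es-master-0 \
  -p '{"spec":{"resources":{"requests":{"storage":"100Gi"}}}}'
```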

The error is consistent with Elasticsearch starting with an empty data dir.
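You can check that directly; with the official image the data path defaults to /usr/share/elasticsearch/data, so something like this (pod name taken from your logs) shows whether any state is left:

```
# Sketch only: list the node's data path to see whether any cluster state or
# index data survived. Path assumes the official image's default data location.
kubectl exec twamp-es-master-1 -- ls -la /usr/share/elasticsearch/data
```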
