Upgrade cluster to 7.9.2, master_not_discovered_exception

Hi,

This started out with Kibana not working, but it now looks like the root issue is that my 3-node cluster is no longer working.

I get this error (telnet to 9300 works on all three search nodes):

{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}
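For anyone scripting a health check around this, here is a minimal Python sketch that recognises the error body above. The function name `is_master_not_discovered` is made up for illustration, and it only looks at the top-level `error.type` field shown in the response; it does not say anything about why no master was found.

```python
import json


def is_master_not_discovered(body: str) -> bool:
    """Return True if a response body reports master_not_discovered_exception.

    Hypothetical helper: it only inspects the top-level "error.type" field,
    as seen in the 503 response quoted in this thread.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON at all, so it cannot be the error we are looking for.
        return False
    if not isinstance(payload, dict):
        return False
    error = payload.get("error")
    if not isinstance(error, dict):
        return False
    return error.get("type") == "master_not_discovered_exception"


# The exact body from the post above.
sample = (
    '{"error":{"root_cause":[{"type":"master_not_discovered_exception",'
    '"reason":null}],"type":"master_not_discovered_exception",'
    '"reason":null},"status":503}'
)
print(is_master_not_discovered(sample))
```

Note that `"reason": null` gives no hint about the cause, so a check like this can only tell you that the cluster has no elected master, not what is preventing the election.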

It doesn't appear that I can force a master, and I'm not sure where to go from here. I put in a ticket with support yesterday, but the forum is usually quicker. Thanks in advance for any suggestions.

Sounds similar to my issue: upgraded from 6.8 to 7.9.2 yesterday. The cluster monitoring all looked good at the time, but this morning I'm getting the same error on any cluster API query. Interestingly, Kibana and the data indexes are available though.
Sorry, don't have a fix - also just raised a case with Elastic support.


An update on this: it turned out to be an issue with TLS being improperly configured on one of my 3 nodes. I had commented out the root CA reference a while back, and I also did an OS update along with this Elastic update (a big no-no). I think what happened was that the system's root CA store dropped my CA, and with the reference commented out on the node, the node couldn't find it. Certificate issues are quite the bear with Elastic (the error I was getting and shared above did not indicate TLS being the issue at all). Once I uncommented the line, the cluster healed itself and started working again.
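In case it helps someone check for the same mistake, below is a rough Python sketch that scans the text of an `elasticsearch.yml` for commented-out CA references. It assumes the CA is referenced through a setting whose key ends in `certificate_authorities` (for example `xpack.security.transport.ssl.certificate_authorities`); the post does not say which setting was actually involved, and the function name and sample paths are hypothetical.

```python
import re
from typing import List

# Assumption: the CA reference is a setting whose key ends in
# "certificate_authorities". A keystore/truststore based setup would need
# a different pattern.
COMMENTED_CA = re.compile(r"^\s*#+\s*([\w.]*certificate_authorities)\s*:")


def commented_out_ca_settings(config_text: str) -> List[str]:
    """Return the keys of CA settings that appear only as comments."""
    found = []
    for line in config_text.splitlines():
        match = COMMENTED_CA.match(line)
        if match:
            found.append(match.group(1))
    return found


# Illustrative config only; the path is a placeholder, not from the thread.
sample_config = """\
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
#xpack.security.transport.ssl.certificate_authorities: [ "certs/ca.crt" ]
"""

for key in commented_out_ca_settings(sample_config):
    print("commented out:", key)
```

A hit does not prove the config is broken, since the CA may be picked up some other way, but it is a quick thing to rule out before digging further.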

Support was very responsive and helpful on this as well.

Not sure if my solution will help, but with an error message that generic, it's worth looking into your TLS configuration if you have it enabled.