Timed out while waiting for initial discovery state - timeout: 30s

I'm using Elastic Stack 6.4.0 (I recently did a rolling upgrade from 6.3.2). After stopping the entire Elasticsearch cluster (master, data, coordinating and ingest nodes), I'm no longer able to start it. I'm getting the following message in the logs:

timed out while waiting for initial discovery state - timeout: 30s

and also:

org.elasticsearch.discovery.MasterNotDiscoveredException: null

I use Zen Discovery and publish_host seems to be correct.

Please advise.

After downgrading the Elastic Stack to 6.3.2 (and later doing a rolling upgrade to 6.4.0), everything works as expected. However, starting the cluster on 6.4.0 leaves the nodes unable to see each other...

Can you share your full config and the logs?

My Elasticsearch cluster consists of:

5 master eligible nodes
5 data nodes
2 ingest nodes
and a few coordinating-only nodes

They all share the following configuration:

# cat cluster.env 
cluster.name=X
# cat discovery.zen.env 
discovery.zen.minimum_master_nodes=3
discovery.zen.ping.unicast.hosts=esm1,esm2,esm3,esm4,esm5
# 

esm = master nodes
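
Since the symptom is nodes not seeing each other, one quick check is whether each host in discovery.zen.ping.unicast.hosts is reachable on the transport port. Below is a minimal sketch, not an official tool. It assumes the default transport port 9300 when an entry has no explicit port and does not handle IPv6 literals. The helper names parse_unicast_hosts and can_connect are hypothetical.

```python
import socket

DEFAULT_TRANSPORT_PORT = 9300  # Elasticsearch's default transport port (assumed unchanged)


def parse_unicast_hosts(value, default_port=DEFAULT_TRANSPORT_PORT):
    """Split a comma-separated hosts setting into (host, port) pairs."""
    hosts = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # "host:port" carries an explicit port; a bare "host" gets the default.
        host, sep, port = entry.partition(":")
        hosts.append((host, int(port) if sep else default_port))
    return hosts


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
```

Run from each node, looping over parse_unicast_hosts("esm1,esm2,esm3,esm4,esm5") and printing can_connect(host, port) for each pair would show whether name resolution or the transport port is the problem. A successful TCP connection only shows the port is open, not that discovery itself works.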

Unfortunately, I blew everything away to go with plan "B", which I described earlier...

Are they dedicated master nodes?

Yes, and they each have a 12 GB heap.

You really only need 3; 5 is a bit of overkill.

Without full logs it's a little hard to speculate as to why this is happening.

Having 5 allows me to do rolling upgrades with minimal downtime.

You can do the same thing with 3.
Having dedicated masters is great, but going beyond 3 gives diminishing returns.

I have the following line:

discovery.zen.minimum_master_nodes=3

If I only have 3 nodes and I restart one of them, is the cluster still green?

You would just reduce discovery.zen.minimum_master_nodes to 2, which is still a majority of 3.
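
The majority rule behind these numbers can be sketched as follows. It uses the usual quorum formula for Zen Discovery, (master-eligible nodes / 2) + 1 with integer division. The function names are hypothetical and only illustrate the arithmetic.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Strict majority of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_master_losses(master_eligible: int) -> int:
    """How many master-eligible nodes can be down while a quorum remains."""
    return master_eligible - minimum_master_nodes(master_eligible)


for n in (3, 5):
    print(n, minimum_master_nodes(n), tolerated_master_losses(n))
# prints:
# 3 2 1
# 5 3 2
```

So with 3 master-eligible nodes and a setting of 2, one master can be restarted at a time during a rolling upgrade. With 5 nodes and a setting of 3, two can be down at once.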
