Master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster

[2019-04-11T01:26:41,474][WARN ][o.e.c.c.ClusterFormationFailureHelper] [d-gp2-es46-1] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [d-gp2-es46-1., d-gp2-es46-2., d-gp2-es46-3.] to bootstrap a cluster: have discovered [{d-gp2-es46-2}{H2bib1wCSBKGu_Ku_4DgjA}{rzokY9nmRDCBNz0lBMgUYw}{<.....>.166.183}{<.....>.166.183:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}, {d-gp2-es46-3}{8KNzmk5uS2mZSZiftNVTDQ}{exvQChr7RPyDlkuJ-FT2Rg}{<.....>.165.141}{<.....>.165.141:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}]; discovery will continue using [<.....>.166.183:9300, <.....>.165.141:9300] from hosts providers and [{d-gp2-es46-1}{mO5fbYm-T1SEooiKMpM2Ag}{hpITTfCFT4yNvibGZc7_7w}{<.....>.165.138}{<.....>.165.138:9300}{ml.machine_memory=4143783936, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 690 in term 0
[2019-04-11T01:26:47,247][DEBUG][o.e.a.a.c.h.TransportClusterHealthAction] [d-gp2-es46-1] timed out while retrying [cluster:monitor/health] after failure (timeout [30s])
[2019-04-11T01:26:47,248][WARN ][r.suppressed ] [d-gp2-es46-1] path: /_cat/health, params: {pretty=, v=}
org.elasticsearch.discovery.MasterNotDiscoveredException: null
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4.onTimeout(TransportMasterNodeAction.java:259) [elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:322) [elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:249) [elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:555) [elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) [elasticsearch-7.0.0.jar:7.0.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:835) [?:?]
[2019-04-11T01:26:51,476][WARN ][o.e.c.c.ClusterFormationFailureHelper] [d-gp2-es46-1] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [d-gp2-es46-1., d-gp2-es46-2., d-gp2-es46-3.] to bootstrap a cluster: have discovered [{d-gp2-es46-2}{H2bib1wCSBKGu_Ku_4DgjA}{rzokY9nmRDCBNz0lBMgUYw}{<.....>.166.183}{<.....>.166.183:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}, {d-gp2-es46-3}{8KNzmk5uS2mZSZiftNVTDQ}{exvQChr7RPyDlkuJ-FT2Rg}{<.....>.165.141}{<.....>.165.141:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}]; discovery will continue using [<.....>.166.183:9300, <.....>.165.141:9300] from hosts providers and [{d-gp2-es46-1}{mO5fbYm-T1SEooiKMpM2Ag}{hpITTfCFT4yNvibGZc7_7w}{<.....>.165.138}{<.....>.165.138:9300}{ml.machine_memory=4143783936, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 690 in term 0
[2019-04-11T01:27:01,477][WARN ][o.e.c.c.ClusterFormationFailureHelper] [d-gp2-es46-1] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [d-gp2-es46-1., d-gp2-es46-2., d-gp2-es46-3.] to bootstrap a cluster: have discovered [{d-gp2-es46-2}{H2bib1wCSBKGu_Ku_4DgjA}{rzokY9nmRDCBNz0lBMgUYw}{<.....>.166.183}{<.....>.166.183:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}, {d-gp2-es46-3}{8KNzmk5uS2mZSZiftNVTDQ}{exvQChr7RPyDlkuJ-FT2Rg}{<.....>.165.141}{<.....>.165.141:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}]; discovery will continue using [<.....>.166.183:9300, <.....>.165.141:9300] from hosts providers and [{d-gp2-es46-1}{mO5fbYm-T1SEooiKMpM2Ag}{hpITTfCFT4yNvibGZc7_7w}{<.....>.165.138}{<.....>.165.138:9300}{ml.machine_memory=4143783936, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 690 in term 0

Those trailing dots look wrong. These entries should be the node names, which don't have trailing dots.

I added the <.....> placeholders to scrub the log.

I also removed the extension on the node names but accidentally left the . on the end.

Oh, but the node names don't have any kind of extension, they're just simply d-gp2-es46-1, d-gp2-es46-2 and d-gp2-es46-3.

They do have an extension, but I removed it so that I'm not publicizing it to the world.

Ok, this is kinda tricky because you're redacting or changing exactly the information we need to look at. The exact strings listed here...

must discover master-eligible nodes [d-gp2-es46-1., d-gp2-es46-2., d-gp2-es46-3.]
                                     ^^^^^^^^^^^^^  ^^^^^^^^^^^^^  ^^^^^^^^^^^^^

... must match the strings here in between the first set of braces ...

have discovered [{d-gp2-es46-2}{H2bib1wCSBKGu_Ku_4DgjA}{rzokY9nmRDCBNz0lBMgUYw}{<.....>.166.183}{<.....>.166.183:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}
                ,{d-gp2-es46-3}{8KNzmk5uS2mZSZiftNVTDQ}{exvQChr7RPyDlkuJ-FT2Rg}{<.....>.165.141}{<.....>.165.141:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}]
                  ^^^^^^^^^^^^

They don't in your post above, because of the trailing dots, and I suspect they do not in your actual installation, but it's hard to know for sure since what I'm looking at isn't quite what Elasticsearch is reporting.
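
The comparison described above can be sketched in Python. This is only an illustration, not how Elasticsearch itself does the check: the function name `unmatched_requirements` and the sample message are made up, and it compares node names only, assuming the message keeps the shape shown in the logs above.

```python
import re


def unmatched_requirements(message, local_node_name=None):
    """Return the required node names that no known node name equals exactly."""
    required_match = re.search(
        r"must discover master-eligible nodes \[([^\]]*)\]", message
    )
    discovered_match = re.search(
        r"have discovered \[(.*?)\]; discovery will continue", message
    )
    if required_match is None or discovered_match is None:
        raise ValueError("not a cluster formation failure message")

    required = [
        name.strip() for name in required_match.group(1).split(",") if name.strip()
    ]
    # Each node is printed as {name}{id}{...}...; keep only the first
    # brace group of each node description.
    discovered = re.findall(r"(?:^|\}, )\{([^{}]*)\}", discovered_match.group(1))

    known = set(discovered)
    if local_node_name is not None:
        # The local node is not in the "have discovered" list, so add it here.
        known.add(local_node_name)
    return [name for name in required if name not in known]


# Abbreviated, made-up message in the same shape as the log lines above.
sample = (
    "master not discovered yet, this node has not previously joined a "
    "bootstrapped (v7+) cluster, and this node must discover master-eligible "
    "nodes [node-1., node-2., node-3.] to bootstrap a cluster: have discovered "
    "[{node-2}{id2}{eph2}{10.0.0.2}{10.0.0.2:9300}{a=1, b=2}, "
    "{node-3}{id3}{eph3}{10.0.0.3}{10.0.0.3:9300}{a=1, b=2}]; "
    "discovery will continue using [10.0.0.2:9300, 10.0.0.3:9300]"
)
print(unmatched_requirements(sample, local_node_name="node-1"))
# prints ['node-1.', 'node-2.', 'node-3.']
```

With the trailing dots in place none of the required names match; an empty list would mean every required name has an exact counterpart.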

OK, I set up an identical test environment and I'm getting the same message.

cluster.name: kyle_dev
node.name: ${HOSTNAME}
node.master: true
node.data: true
node.ingest: true
node.max_local_storage_nodes: 1
path.data: /data/elasticsearch
path.logs: /var/log/elasticsearch
network.host: site
network.bind_host: 0.0.0.0
transport.tcp.port: 9300
http.port: 9200
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping_timeout: 5s
discovery.seed_hosts: ["d-gp2-kyles-1", "d-gp2-kyles-2", "d-gp2-kyles-3"]
cluster.initial_master_nodes: ["d-gp2-kyles-1", "d-gp2-kyles-2", "d-gp2-kyles-3"]

[2019-04-11T12:56:14,824][WARN ][o.e.c.c.ClusterFormationFailureHelper] [d-gp2-kyles-1] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [d-gp2-kyles-1, d-gp2-kyles-2, d-gp2-kyles-3] to bootstrap a cluster: have discovered [{d-gp2-kyles-3}{DFvnIfwDRS-g1Z3nGRuogg}{6C8s3FIvRIqzD5GEuuXRkw}{10.124.193.72}{10.124.193.72:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}, {d-gp2-kyles-2}{1hztFiznTyqRqRYCXmzb9w}{3yCmrffHRryJBeU_-WflYQ}{10.124.193.70}{10.124.193.70:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}]; discovery will continue using [10.124.193.70:9300, 10.124.193.72:9300] from hosts providers and [{d-gp2-kyles-1}{A2tvPe40SvinU51ftETHTg}{91NBxl1dSAel4_gJdakcIQ}{10.124.193.71}{10.124.193.71:9300}{ml.machine_memory=4143783936, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0

Hmm, ok, and could you just confirm that this is exactly what Elasticsearch is saying without any redactions or alterations?

Could you add the following two lines to elasticsearch.yml and restart this node:

logger.org.elasticsearch.cluster.coordination.ClusterBootstrapService: TRACE
logger.org.elasticsearch.discovery: TRACE

Then it'd be great if you could share the logs emitted from startup until the first o.e.c.c.ClusterFormationFailureHelper message on https://gist.github.com/ or similar.
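
If the log is long, one way to cut it down to that range is a few lines of Python. This is a rough sketch: the helper name `until_first_formation_warning` and the sample lines are made up, and it assumes the log you feed it begins at node startup.

```python
def until_first_formation_warning(lines, marker="ClusterFormationFailureHelper"):
    """Return lines up to and including the first one that mentions marker."""
    selected = []
    for line in lines:
        selected.append(line)
        if marker in line:
            break
    return selected


# Made-up stand-ins for real log lines, just to show the behaviour.
example = [
    "startup line\n",
    "trace line\n",
    "[WARN ][o.e.c.c.ClusterFormationFailureHelper] master not discovered yet\n",
    "later line\n",
]
print("".join(until_first_formation_warning(example)), end="")
```

To use it on a real file, pass an open file object instead of the list, for example `until_first_formation_warning(open(path, encoding="utf-8"))`, where `path` is wherever your node writes its log.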

[2019-04-11T13:08:06,690][TRACE][o.e.d.PeerFinder ] [d-gp2-kyles-1] startProbe(10.124.193.71:9300) not probing local node
[2019-04-11T13:08:06,691][TRACE][o.e.d.SeedHostsResolver ] [d-gp2-kyles-1] resolved host [d-gp2-kyles-1] to [10.124.193.71:9300]
[2019-04-11T13:08:06,691][TRACE][o.e.d.SeedHostsResolver ] [d-gp2-kyles-1] resolved host [d-gp2-kyles-2] to [10.124.193.70:9300]
[2019-04-11T13:08:06,691][TRACE][o.e.d.SeedHostsResolver ] [d-gp2-kyles-1] resolved host [d-gp2-kyles-3] to [10.124.193.72:9300]
[2019-04-11T13:08:06,691][TRACE][o.e.d.PeerFinder ] [d-gp2-kyles-1] probing resolved transport addresses [10.124.193.70:9300, 10.124.193.72:9300]
[2019-04-11T13:08:06,692][TRACE][o.e.d.PeerFinder ] [d-gp2-kyles-1] Peer{transportAddress=10.124.193.72:9300, discoveryNode={d-gp2-kyles-3}{DFvnIfwDRS-g1Z3nGRuogg}{deJGbHiTSu-DUE7mBvtYvQ}{10.124.193.72}{10.124.193.72:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}, peersRequestInFlight=true} received PeersResponse{masterNode=Optional.empty, knownPeers=[{d-gp2-kyles-1}{A2tvPe40SvinU51ftETHTg}{5A0lXrEVRkuB4XwEv3pXcA}{10.124.193.71}{10.124.193.71:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}, {d-gp2-kyles-2}{1hztFiznTyqRqRYCXmzb9w}{6dF894WOQSKczOskBSABcQ}{10.124.193.70}{10.124.193.70:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}], term=0}
[2019-04-11T13:08:06,692][TRACE][o.e.d.PeerFinder ] [d-gp2-kyles-1] startProbe(10.124.193.71:9300) not probing local node
[2019-04-11T13:08:06,693][TRACE][o.e.d.PeerFinder ] [d-gp2-kyles-1] Peer{transportAddress=10.124.193.70:9300, discoveryNode={d-gp2-kyles-2}{1hztFiznTyqRqRYCXmzb9w}{6dF894WOQSKczOskBSABcQ}{10.124.193.70}{10.124.193.70:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}, peersRequestInFlight=true} received PeersResponse{masterNode=Optional.empty, knownPeers=[{d-gp2-kyles-1}{A2tvPe40SvinU51ftETHTg}{5A0lXrEVRkuB4XwEv3pXcA}{10.124.193.71}{10.124.193.71:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}, {d-gp2-kyles-3}{DFvnIfwDRS-g1Z3nGRuogg}{deJGbHiTSu-DUE7mBvtYvQ}{10.124.193.72}{10.124.193.72:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}], term=0}
[2019-04-11T13:08:06,693][TRACE][o.e.d.PeerFinder ] [d-gp2-kyles-1] startProbe(10.124.193.71:9300) not probing local node
[2019-04-11T13:08:06,949][TRACE][o.e.d.PeerFinder ] [d-gp2-kyles-1] startProbe(10.124.193.71:9300) not probing local node
[2019-04-11T13:08:07,173][TRACE][o.e.d.PeerFinder ] [d-gp2-kyles-1] startProbe(10.124.193.71:9300) not probing local node
[2019-04-11T13:08:07,691][TRACE][o.e.d.PeerFinder ] [d-gp2-kyles-1] Peer{transportAddress=10.124.193.72:9300, discoveryNode={d-gp2-kyles-3}{DFvnIfwDRS-g1Z3nGRuogg}{deJGbHiTSu-DUE7mBvtYvQ}{10.124.193.72}{10.124.193.72:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}, peersRequestInFlight=false} requesting peers
[2019-04-11T13:08:07,692][TRACE][o.e.d.PeerFinder ] [d-gp2-kyles-1] Peer{transportAddress=10.124.193.70:9300, discoveryNode={d-gp2-kyles-2}{1hztFiznTyqRqRYCXmzb9w}{6dF894WOQSKczOskBSABcQ}{10.124.193.70}{10.124.193.70:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}, peersRequestInFlight=false} requesting peers
[2019-04-11T13:08:07,693][TRACE][o.e.d.PeerFinder ] [d-gp2-kyles-1] probing master nodes from cluster state: nodes:
{d-gp2-kyles-1}{A2tvPe40SvinU51ftETHTg}{5A0lXrEVRkuB4XwEv3pXcA}{10.124.193.71}{10.124.193.71:9300}{ml.machine_memory=4143783936, xpack.installed=true, ml.max_open_jobs=20}, local

[2019-04-11T13:08:07,693][TRACE][o.e.d.PeerFinder ] [d-gp2-kyles-1] startProbe(10.124.193.71:9300) not probing local node
[2019-04-11T13:08:07,694][TRACE][o.e.d.SeedHostsResolver ] [d-gp2-kyles-1] resolved host [d-gp2-kyles-1] to [10.124.193.71:9300]
[2019-04-11T13:08:07,694][TRACE][o.e.d.SeedHostsResolver ] [d-gp2-kyles-1] resolved host [d-gp2-kyles-2] to [10.124.193.70:9300]
[2019-04-11T13:08:07,694][TRACE][o.e.d.SeedHostsResolver ] [d-gp2-kyles-1] resolved host [d-gp2-kyles-3] to [10.124.193.72:9300]
[2019-04-11T13:08:07,694][TRACE][o.e.d.PeerFinder ] [d-gp2-kyles-1] probing resolved transport addresses [10.124.193.70:9300, 10.124.193.72:9300]
[2019-04-11T13:08:07,695][TRACE][o.e.d.PeerFinder ] [d-gp2-kyles-1] Peer{transportAddress=10.124.193.70:9300, discoveryNode={d-gp2-kyles-2}{1hztFiznTyqRqRYCXmzb9w}{6dF894WOQSKczOskBSABcQ}{10.124.193.70}{10.124.193.70:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}, peersRequestInFlight=true} received PeersResponse{masterNode=Optional.empty, knownPeers=[{d-gp2-kyles-1}{A2tvPe40SvinU51ftETHTg}{5A0lXrEVRkuB4XwEv3pXcA}{10.124.193.71}{10.124.193.71:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}, {d-gp2-kyles-3}{DFvnIfwDRS-g1Z3nGRuogg}{deJGbHiTSu-DUE7mBvtYvQ}{10.124.193.72}{10.124.193.72:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}], term=0}
[2019-04-11T13:08:07,695][TRACE][o.e.d.PeerFinder ] [d-gp2-kyles-1] Peer{transportAddress=10.124.193.72:9300, discoveryNode={d-gp2-kyles-3}{DFvnIfwDRS-g1Z3nGRuogg}{deJGbHiTSu-DUE7mBvtYvQ}{10.124.193.72}{10.124.193.72:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}, peersRequestInFlight=true} received PeersResponse{masterNode=Optional.empty, knownPeers=[{d-gp2-kyles-1}{A2tvPe40SvinU51ftETHTg}{5A0lXrEVRkuB4XwEv3pXcA}{10.124.193.71}{10.124.193.71:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}, {d-gp2-kyles-2}{1hztFiznTyqRqRYCXmzb9w}{6dF894WOQSKczOskBSABcQ}{10.124.193.70}{10.124.193.70:9300}{ml.machine_memory=4143783936, ml.max_open_jobs=20, xpack.installed=true}], term=0}
[2019-04-11T13:08:07,695][TRACE][o.e.d.PeerFinder ] [d-gp2-kyles-1] startProbe(10.124.193.71:9300) not probing local node
[2019-04-11T13:08:07,695][TRACE][o.e.d.PeerFinder ] [d-gp2-kyles-1] startProbe(10.124.193.71:9300) not probing local node

Can I attach the log somehow?

Sorry, there are no attachments in this forum system, but it normally works well to use http://gist.github.com/ or https://pastebin.com.

Basically it was just that, repeated a thousand times.

The details are, unfortunately, important. There's something very unexpected going on here, and the clue to the solution will be somewhere in these logs. There will also be quite a few log messages that don't look like this.

Can I email you the log?

Sure, I'm david.turner@elastic.co

https://pastebin.com/C1scknA6

Thanks. I'm utterly baffled right now:

[2019-04-11T13:06:50,465][TRACE][o.e.c.c.ClusterBootstrapService] [d-gp2-kyles-1] nodesMatchingRequirements=[], unsatisfiedRequirements=[d-gp2-kyles-1, d-gp2-kyles-2, d-gp2-kyles-3], bootstrapRequirements=[d-gp2-kyles-1, d-gp2-kyles-2, d-gp2-kyles-3]
[2019-04-11T13:06:50,470][INFO ][o.e.c.c.ClusterBootstrapService] [d-gp2-kyles-1] skipping cluster bootstrapping as local node does not match bootstrap requirements: [d-gp2-kyles-1, d-gp2-kyles-2, d-gp2-kyles-3]

These two messages are saying that the local node called d-gp2-kyles-1 doesn't match the d-gp2-kyles-1 in cluster.initial_master_nodes. As far as I can tell they're byte-for-byte equal, although one of them was typed directly into the config file and the other comes from the environment. Could you try setting node.name: d-gp2-kyles-1 explicitly in elasticsearch.yml instead of using ${HOSTNAME} so that they both come from the same place?
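
One way to check whether two names that look identical really are byte-for-byte equal is to print them with escapes and hex bytes visible. The sketch below is only a diagnostic aid with made-up helper names (`describe`, `first_difference`), and it makes no claim about what the cause is here. It also assumes the shell you run it in sees the same `HOSTNAME` as the Elasticsearch process, which may not hold; the variable can even be empty if it is not exported.

```python
import os


def describe(value):
    """Show a string with escapes visible, plus its UTF-8 bytes in hex."""
    hex_bytes = " ".join(f"{byte:02x}" for byte in value.encode("utf-8"))
    return f"{value!r} ({hex_bytes})"


def first_difference(left, right):
    """Return the index of the first differing character, or None if equal."""
    for index, (a, b) in enumerate(zip(left, right)):
        if a != b:
            return index
    if len(left) != len(right):
        # One string is a prefix of the other, e.g. trailing whitespace.
        return min(len(left), len(right))
    return None


configured = "d-gp2-kyles-1"               # as typed in cluster.initial_master_nodes
from_env = os.environ.get("HOSTNAME", "")  # assumption: same value ${HOSTNAME} expands to

print("configured:", describe(configured))
print("from env:  ", describe(from_env))
print("first difference at index:", first_difference(configured, from_env))
```

A result of `None` on the last line means the two strings are identical; any index points at the first character where they diverge, which makes invisible characters such as a trailing newline or carriage return easy to spot.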

You can lose logger.org.elasticsearch.discovery: TRACE too, at least until we work out what the ClusterBootstrapService is struggling with.

https://pastebin.com/NcV6NEXH