Cannot bootstrap a new cluster: master not discovered or elected yet

decibel83 · September 17, 2019, 4:46pm

Hi,
after having some problems on another test cluster (I asked for help in an earlier post), I'm trying to set up a new cluster from scratch to understand what's happening.

I'm stuck: this is the third time I've tried to start the new cluster, but the master node cannot be discovered.

My three nodes are:

elastic1.domain.com: 192.168.245.71
elastic2.domain.com: 192.168.245.72
elastic3.domain.com: 192.168.245.73

All nodes are on the same network and can see each other, there are no firewall rules, and Elasticsearch is running on every node and listening on that node's own IP. Every node is both master and data.

This is the node configuration:

cluster.name: mycluster
node.name: elastic[1:3].domain.com
node.master: true
node.data: true
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch

network.host:
  - 127.0.0.1
  - 192.168.245.7[1:3]

network.publish_host: 192.168.245.7[1:3]

discovery.seed_hosts:
  - elastic1.domain.com
  - elastic2.domain.com
  - elastic3.domain.com

cluster.initial_master_nodes:
  - elastic1.domain.com
  - elastic2.domain.com
  - elastic3.domain.com

I started every node from scratch, one by one, with the /var/lib/elasticsearch folder empty. I repeated the procedure twice without any success.

In the logs I see many errors like these:

[2019-09-17T18:19:57,497][INFO ][o.e.c.c.JoinHelper       ] [elastic3.domain.com] failed to join {elastic2.domain.com}{hwnyk1WnRYmD9yktsXg73g}{SZgLY6V2TkS_Ossu0gUxJw}{192.168.245.72}{192.168.245.72:9300}{dim}{ml.machine_memory=16822104064, ml.max_open_jobs=20, xpack.installed=true} with JoinRequest{sourceNode={elastic3.domain.com}{KjVRP9WgTQy-P8JV18LbDw}{bVvohjzoQ0C5LQcUKjckew}{192.168.245.73}{192.168.245.73:9300}{dim}{ml.machine_memory=16822104064, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional[Join{term=35, lastAcceptedTerm=0, lastAcceptedVersion=0, sourceNode={elastic3.domain.com}{KjVRP9WgTQy-P8JV18LbDw}{bVvohjzoQ0C5LQcUKjckew}{192.168.245.73}{192.168.245.73:9300}{dim}{ml.machine_memory=16822104064, xpack.installed=true, ml.max_open_jobs=20}, targetNode={elastic2.domain.com}{hwnyk1WnRYmD9yktsXg73g}{SZgLY6V2TkS_Ossu0gUxJw}{192.168.245.72}{192.168.245.72:9300}{dim}{ml.machine_memory=16822104064, ml.max_open_jobs=20, xpack.installed=true}}]}
org.elasticsearch.transport.NodeDisconnectedException: [elastic2.domain.com][192.168.245.72:9300][internal:cluster/coordination/join] disconnected
[2019-09-17T18:19:57,501][INFO ][o.e.c.c.Coordinator      ] [elastic3.domain.com] master node [{elastic2.domain.com}{hwnyk1WnRYmD9yktsXg73g}{SZgLY6V2TkS_Ossu0gUxJw}{192.168.245.72}{192.168.245.72:9300}{dim}{ml.machine_memory=16822104064, ml.max_open_jobs=20, xpack.installed=true}] failed, restarting discovery
org.elasticsearch.transport.ConnectTransportException: [elastic2.domain.com][192.168.245.72:9300] disconnected during check
    at org.elasticsearch.cluster.coordination.LeaderChecker$CheckScheduler$1.handleException(LeaderChecker.java:268) ~[elasticsearch-7.3.2.jar:7.3.2]
    at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:544) ~[elasticsearch-7.3.2.jar:7.3.2]
    at org.elasticsearch.cluster.coordination.LeaderChecker$CheckScheduler.handleWakeUp(LeaderChecker.java:237) ~[elasticsearch-7.3.2.jar:7.3.2]
    at org.elasticsearch.cluster.coordination.LeaderChecker.updateLeader(LeaderChecker.java:150) ~[elasticsearch-7.3.2.jar:7.3.2]
    at org.elasticsearch.cluster.coordination.Coordinator.becomeFollower(Coordinator.java:620) ~[elasticsearch-7.3.2.jar:7.3.2]
    at org.elasticsearch.cluster.coordination.Coordinator.onFollowerCheckRequest(Coordinator.java:243) ~[elasticsearch-7.3.2.jar:7.3.2]
    at org.elasticsearch.cluster.coordination.FollowersChecker$2.doRun(FollowersChecker.java:187) ~[elasticsearch-7.3.2.jar:7.3.2]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:758) ~[elasticsearch-7.3.2.jar:7.3.2]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.3.2.jar:7.3.2]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:835) [?:?]
Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic2.domain.com][192.168.245.72:9300] Node not connected
    at org.elasticsearch.transport.ConnectionManager.getConnection(ConnectionManager.java:151) ~[elasticsearch-7.3.2.jar:7.3.2]
    at org.elasticsearch.transport.TransportService.getConnection(TransportService.java:568) ~[elasticsearch-7.3.2.jar:7.3.2]
    at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:540) ~[elasticsearch-7.3.2.jar:7.3.2]
    ... 10 more

Could you help me please?

I am really stuck on this problem.
Thanks!

rugenl · September 17, 2019, 5:36pm

Is the [1:3] really in your elasticsearch.yml or is that just doc for the post?

I don't think you can use multiple addresses in network.host, nor that [1:3] notation.

Try network.host: 0.0.0.0 and remove network.publish_host.

decibel83 · September 17, 2019, 7:16pm

[1:3] is not actually written in the configuration file.
It's just a way to document that each node has its own IP address in its configuration file.

The real configuration parameters are:

elastic1:

network.host:
  - 127.0.0.1
  - 192.168.245.71

elastic2:

network.host:
  - 127.0.0.1
  - 192.168.245.72

elastic3:

network.host:
  - 127.0.0.1
  - 192.168.245.73

Thanks!
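
To make the shorthand explicit, here is a minimal Python sketch that expands the [1:3] placeholder into the per-node values. The function name expand_range is hypothetical and only illustrates the notation used in this thread; it is not an Elasticsearch feature, and each node's elasticsearch.yml contains just its own literal value.

```python
import re


def expand_range(template: str) -> list:
    """Expand a single '[a:b]' placeholder into the values a..b inclusive.

    Hypothetical helper: it mirrors the documentation shorthand from this
    thread, nothing that Elasticsearch itself parses.
    """
    match = re.search(r"\[(\d+):(\d+)\]", template)
    if match is None:
        # No placeholder present: the template stands for a single value.
        return [template]
    start, end = int(match.group(1)), int(match.group(2))
    prefix, suffix = template[:match.start()], template[match.end():]
    return [f"{prefix}{n}{suffix}" for n in range(start, end + 1)]


# Per-node values implied by the shorthand in the first post.
node_names = expand_range("elastic[1:3].domain.com")
publish_hosts = expand_range("192.168.245.7[1:3]")

for name, ip in zip(node_names, publish_hosts):
    print(f"{name}: network.publish_host = {ip}")
```

Each printed line corresponds to one node, so elastic1 gets 192.168.245.71, elastic2 gets 192.168.245.72 and elastic3 gets 192.168.245.73.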

DavidTurner · September 18, 2019, 7:07am

The few log messages you have shared are suggestive of network issues: it looks like a connection between the nodes is being established and then dropped when the nodes start to exchange meaningful information.

One possibility: do you have any kind of security device (e.g. firewall or IDS) which might be considering the Elasticsearch traffic as suspicious and dropping these connections? Elasticsearch will be exchanging information containing IP addresses and host names and so on and it's certainly possible that a badly-configured IDS could be triggered by that kind of traffic. If so, either disable it or else enable TLS on your cluster so it can't see the traffic any more.
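
As a first, rough check of the network path, something like the Python sketch below could be run on each node to see whether the transport port (9300 in the logs above) on the other nodes accepts a TCP connection. The function names can_connect and check_nodes are hypothetical, and the timeout is an arbitrary assumption. A successful connect only shows the port is reachable at that moment; it does not rule out a device that drops the connection later, once the nodes start exchanging data.

```python
import socket


def can_connect(host: str, port: int = 9300, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: it only tests that the port accepts a connection,
    it does not speak the Elasticsearch transport protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_nodes(hosts, port: int = 9300) -> dict:
    """Map each host to whether its transport port accepted a connection."""
    return {host: can_connect(host, port) for host in hosts}
```

For example, calling check_nodes(["elastic1.domain.com", "elastic2.domain.com", "elastic3.domain.com"]) on each of the three machines gives a quick picture of which pairs of nodes can reach each other. If every check passes and the joins still fail, that would be consistent with the connection being dropped after it is established.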

system · October 16, 2019, 7:20am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
