Data nodes are not able to join the master node and fail to form a cluster

Hi All,

I have created an Elasticsearch cluster before with the same configuration as given below, and it worked fine without any issue. But I am not sure why I am getting the message "not enough master nodes discovered during pinging (found [[]], but needed [1])" while creating a new cluster on new servers, and the nodes are not able to join the master node. I am stuck on this issue, so I am asking for your help. Please let me know what I am doing wrong here.

Master Node -
cluster.name: cluster_dashbaord
node.name: ${HOSTNAME}
node.master: true
node.data: false
path.data: /var/fpwork/workspace_smtrs/elasticsearch-6.4.0/data
path.logs: /var/fpwork/workspace_smtrs/elasticsearch-6.4.0/logs
discovery.zen.ping.unicast.hosts: ["10.182.197.100","10.182.197.101","10.182.197.102"]
bootstrap.system_call_filter: false
action.auto_create_index: ".security*,.monitoring*,.watches,.triggered_watches,.watcher-history*,.ml*"
network.host: 0.0.0.0
transport.tcp.compress: true
http.port : 9400
transport.tcp.port: 9500
discovery.zen.minimum_master_nodes: 1
network.publish_host: 0.0.0.0
network.bind_host: 0.0.0.0

Data Node 1 -
cluster.name: cluster_dashbaord
node.name: ${HOSTNAME}
node.master: false
node.data: true
path.data: /var/fpwork/workspace_smtrs/elasticsearch-6.4.0/data
path.logs: /var/fpwork/workspace_smtrs/elasticsearch-6.4.0/logs
discovery.zen.ping.unicast.hosts: ["10.182.197.100","10.182.197.101","10.182.197.102"]
bootstrap.system_call_filter: false
action.auto_create_index: ".security*,.monitoring*,.watches,.triggered_watches,.watcher-history*,.ml*"
network.host: 0.0.0.0
transport.tcp.compress: true
http.port : 9400
transport.tcp.port: 9500
discovery.zen.minimum_master_nodes: 1
network.publish_host: 0.0.0.0

Data Node 2 -
cluster.name: cluster_dashbaord
node.name: ${HOSTNAME}
node.master: false
node.data: true
path.data: /var/fpwork/workspace_smtrs/elasticsearch-6.4.0/data
path.logs: /var/fpwork/workspace_smtrs/elasticsearch-6.4.0/logs
discovery.zen.ping.unicast.hosts: ["10.182.197.100","10.182.197.101","10.182.197.102"]
bootstrap.system_call_filter: false
action.auto_create_index: ".security*,.monitoring*,.watches,.triggered_watches,.watcher-history*,.ml*"
network.host: 0.0.0.0
transport.tcp.compress: true
http.port : 9400
transport.tcp.port: 9500
discovery.zen.minimum_master_nodes: 1
network.publish_host: 0.0.0.0

Console Snapshot

[2018-09-07T00:30:46,305][WARN ][o.e.d.z.ZenDiscovery ] [tr_cloud2] not enough master nodes discovered during pinging (found [[]], but needed [1]), pinging again
[2018-09-07T00:30:49,306][WARN ][o.e.d.z.ZenDiscovery ] [tr_cloud2] not enough master nodes discovered during pinging (found [[]], but needed [1]), pinging again
[2018-09-07T00:30:52,307][WARN ][o.e.d.z.ZenDiscovery ] [tr_cloud2] not enough master nodes discovered during pinging (found [[]], but needed [1]), pinging again
[2018-09-07T00:30:55,297][WARN ][o.e.n.Node ] [tr_cloud2] timed out while waiting for initial discovery state - timeout: 30s
[2018-09-07T00:30:55,308][WARN ][o.e.d.z.ZenDiscovery ] [trs_cloud2] not enough master nodes discovered during pinging (found [[]], but needed [1]), pinging again
[2018-09-07T00:30:55,315][INFO ][o.e.x.s.t.n.SecurityNetty4HttpServerTransport] [trs_cloud2] publish_address {192.168.0.37:9400}, bound_addresses {[::]:9400}
[2018-09-07T00:30:55,316][INFO ][o.e.n.Node ] [tr_cloud2] started
[2018-09-07T00:30:58,309][WARN ][o.e.d.z.ZenDiscovery ] [trs_cloud2] not enough master nodes discovered during pinging (found [[]], but needed [1]), pinging again

@dadoonet Can you please help me? I am totally frustrated with this issue and am not able to find the problem.

As you have used a non-default transport port (9500), you need to include the port number in the unicast host list. I would also recommend making all nodes master-eligible and setting minimum_master_nodes to 2 in order to improve stability and resilience.
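
For example, the relevant lines on each of the three nodes would then look roughly like this (a sketch based on the configuration already posted in this thread, leaving node.data as each node already has it; the IPs and port 9500 are the ones used above):

node.master: true
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["10.182.197.100:9500","10.182.197.101:9500","10.182.197.102:9500"]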


Yeah. Use:

discovery.zen.ping.unicast.hosts: ["10.182.197.100:9500","10.182.197.101:9500","10.182.197.102:9500"]

@Christian_Dahlqvist When I changed minimum_master_nodes to 2 and used the setting discovery.zen.ping.unicast.hosts: ["10.182.197.102:9500","10.182.197.132:9500","10.182.197.160:9500"], I am getting the error below.

[2018-09-07T14:22:35,361][WARN ][o.e.d.z.ZenDiscovery ] [-scrum] not enough master nodes discovered during pinging (found [[Candidate{node={vikrant-scrum}{ixDYsLpQQzyn-30o4j4iXA}{d4jAeq96Tk-1G8FM4aOCUw}{192.168.0.37}{192.168.0.37:9500}{ml.machine_memory=67560501248, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-09-07T14:22:38,362][WARN ][o.e.d.z.ZenDiscovery ] [-scrum] not enough master nodes discovered during pinging (found [[Candidate{node={vikrant-scrum}{ixDYsLpQQzyn-30o4j4iXA}{d4jAeq96Tk-1G8FM4aOCUw}{192.168.0.37}{192.168.0.37:9500}{ml.machine_memory=67560501248, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-09-07T14:22:41,363][WARN ][o.e.d.z.ZenDiscovery ] [-scrum] not enough master nodes discovered during pinging (found [[Candidate{node={vikrant-scrum}{ixDYsLpQQzyn-30o4j4iXA}{d4jAeq96Tk-1G8FM4aOCUw}{192.168.0.37}{192.168.0.37:9500}{ml.machine_memory=67560501248, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}, clusterStateVersion=-1}]], but needed [2]), pinging again

Did you make all your nodes master-eligible (node.master: true) as well?


Yes I did.

Are there any firewall rules in place that prevent the nodes from connecting to port 9500 on the other hosts? Test this by logging into one of the hosts and trying to telnet to port 9500 on the others.
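
For example, from the master host (using the IPs from the unicast list in this thread), something like:

telnet 10.182.197.101 9500
telnet 10.182.197.102 9500

If the connection is refused or times out, the transport port is blocked between those hosts.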

With minimum_master_nodes set back to 1 and the following configuration:
discovery.zen.ping.unicast.hosts: ["10.182.197.100:9500","10.182.197.101:9500","10.182.197.102:9500"]
I am still facing the same issue. Please suggest.

Console Output -
[2018-09-07T14:29:29,082][INFO ][o.e.n.Node ] [-scrum] starting ...
[2018-09-07T14:29:29,216][INFO ][o.e.t.TransportService ] [-scrum] publish_address {192.168.0.37:9500}, bound_addresses {[::]:9500}
[2018-09-07T14:29:29,229][INFO ][o.e.b.BootstrapChecks ] [-scrum] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2018-09-07T14:29:32,249][WARN ][o.e.d.z.ZenDiscovery ] [-scrum] not enough master nodes discovered during pinging (found [[]], but needed [1]), pinging again
[2018-09-07T14:29:35,250][WARN ][o.e.d.z.ZenDiscovery ] [-scrum] not enough master nodes discovered during pinging (found [[]], but needed [1]), pinging again

Please verify that there are no network or connectivity issues.

I have tried telnet to port 9500 and it fails now, but previously telnet worked fine and I was still facing the same issue at that time. How can I solve this telnet issue by stopping the firewall?

That depends on your operating system and how it is set up, so I do not think I can help there.

I am running Elastic on RHEL.
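
On RHEL, assuming firewalld is what is blocking the ports (that is only an assumption; it could also be iptables rules or an external firewall), opening the HTTP and transport ports used above would look roughly like this on each host:

firewall-cmd --permanent --add-port=9400/tcp
firewall-cmd --permanent --add-port=9500/tcp
firewall-cmd --reload
firewall-cmd --list-ports

After that, the telnet test above should succeed.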

Whenever I change the ports to 9200 and 9300, it gives the error below and I am not able to find the root cause.
http.port : 9200
transport.tcp.port: 9300

Console Log -
[2018-09-07T14:41:52,662][WARN ][o.e.d.z.ZenDiscovery ] [-scrum] failed to connect to master [{dhananjay-test-1}{OcBKCmBZQy2nZvFoaG7p7A}{QL0sGP3vQhG8KGWmA3RUdw}{192.168.0.48}{192.168.0.48:9300}{ml.machine_memory=25284042752, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}], retrying...
org.elasticsearch.transport.ConnectTransportException: [dhananjay-test-1][192.168.0.48:9300] connect_exception
at org.elasticsearch.transport.TcpChannel.awaitConnected(TcpChannel.java:165) ~[elasticsearch-6.4.0.jar:6.4.0]
at org.elasticsearch.transport.TcpTransport.openConnection(TcpTransport.java:643) ~[elasticsearch-6.4.0.jar:6.4.0]
at org.elasticsearch.transport.TcpTransport.connectToNode(TcpTransport.java:542) ~[elasticsearch-6.4.0.jar:6.4.0]
at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:329) ~[elasticsearch-6.4.0.jar:6.4.0]
at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:316) ~[elasticsearch-6.4.0.jar:6.4.0]
at org.elasticsearch.discovery.zen.ZenDiscovery.joinElectedMaster(ZenDiscovery.java:507) [elasticsearch-6.4.0.jar:6.4.0]
at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:475) [elasticsearch-6.4.0.jar:6.4.0]
at org.elasticsearch.discovery.zen.ZenDiscovery.access$2500(ZenDiscovery.java:88) [elasticsearch-6.4.0.jar:6.4.0]
at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1245) [elasticsearch-6.4.0.jar:6.4.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [elasticsearch-6.4.0.jar:6.4.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]
Caused by: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: 192.168.0.48/192.168.0.48:9300

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.