Elasticsearch Master not discovered exception


#1

I'm running a 5-node Elasticsearch cluster (2 data nodes, 2 master nodes, 1 Kibana node).

I'm getting the following error when I use the command:

curl -X GET "192.168.107.75:9200/_cat/master?v"
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null} ],"type":"master_not_discovered_exception","reason":null},"status":503}

I'm using the following command to run Elasticsearch:

sudo systemctl start elasticsearch.service

These are the messages I see in the logs:

[2018-05-28T21:02:22,074][WARN ][o.e.d.z.ZenDiscovery     ] [node-master-1] not enough master nodes discovered during pinging (found [[Candidate{node={node-master-1}{kJKYkpdbTKmdIeq-RVnCAQ}{JGbXMxOXR0SyjCu746Zlwg}{192.168.107.75}{192.168.107.75:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-05-28T21:02:25,076][WARN ][o.e.d.z.ZenDiscovery     ] [node-master-1] not enough master nodes discovered during pinging (found [[Candidate{node={node-master-1}{kJKYkpdbTKmdIeq-RVnCAQ}{JGbXMxOXR0SyjCu746Zlwg}{192.168.107.75}{192.168.107.75:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-05-28T21:02:28,077][WARN ][o.e.d.z.ZenDiscovery     ] [node-master-1] not enough master nodes discovered during pinging (found [[Candidate{node={node-master-1}{kJKYkpdbTKmdIeq-RVnCAQ}{JGbXMxOXR0SyjCu746Zlwg}{192.168.107.75}{192.168.107.75:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-05-28T21:02:31,079][WARN ][o.e.d.z.ZenDiscovery     ] [node-master-1] not enough master nodes discovered during pinging (found [[Candidate{node={node-master-1}{kJKYkpdbTKmdIeq-RVnCAQ}{JGbXMxOXR0SyjCu746Zlwg}{192.168.107.75}{192.168.107.75:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-05-28T21:02:34,081][WARN ][o.e.d.z.ZenDiscovery     ] [node-master-1] not enough master nodes discovered during pinging (found [[Candidate{node={node-master-1}{kJKYkpdbTKmdIeq-RVnCAQ}{JGbXMxOXR0SyjCu746Zlwg}{192.168.107.75}{192.168.107.75:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-05-28T21:02:37,084][WARN ][o.e.d.z.ZenDiscovery     ] [node-master-1] not enough master nodes discovered during pinging (found [[Candidate{node={node-master-1}{kJKYkpdbTKmdIeq-RVnCAQ}{JGbXMxOXR0SyjCu746Zlwg}{192.168.107.75}{192.168.107.75:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-05-28T21:02:40,090][WARN ][o.e.d.z.ZenDiscovery     ] [node-master-1] failed to connect to master [{node-master-2}{_M4BTrFbQguT3PbY5d2_JA}{1rzJcDPSQ5OH2OZ_CnhR-g}{192.168.107.76}{192.168.107.76:9300}], retrying...
org.elasticsearch.transport.ConnectTransportException: [node-master-2][192.168.107.76:9300] connect_exception
    at org.elasticsearch.transport.TcpChannel.awaitConnected(TcpChannel.java:165) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.transport.TcpTransport.openConnection(TcpTransport.java:616) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.transport.TcpTransport.connectToNode(TcpTransport.java:513) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:331) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:318) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.discovery.zen.ZenDiscovery.joinElectedMaster(ZenDiscovery.java:515) [elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:483) [elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.discovery.zen.ZenDiscovery.access$2500(ZenDiscovery.java:90) [elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1253) [elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573) [elasticsearch-6.2.4.jar:6.2.4]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_172]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_172]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_172]
Caused by: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: 192.168.107.76/192.168.107.76:9300
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:?]
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:323) ~[?:?]
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:633) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) ~[?:?]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) ~[?:?]
    ... 1 more
Caused by: java.net.NoRouteToHostException: No route to host
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:?]
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:323) ~[?:?]
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:633) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) ~[?:?]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) ~[?:?]
    ... 1 more
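
For reference, the root cause above is a NoRouteToHostException for 192.168.107.76:9300, which on an otherwise pingable host usually means a firewall REJECT rule rather than an Elasticsearch problem. A minimal check from node-master-1 (the firewalld/iptables commands are assumptions, adjust to your distribution):

telnet 192.168.107.76 9300        # the exact host/port from the exception
sudo firewall-cmd --list-ports    # if firewalld is in use, 9300/tcp must be allowed
sudo iptables -L -n | grep 9300   # otherwise look for a rule rejecting the port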

In the elasticsearch.yml file, apart from the config for assigning different roles to the nodes, I'm using the following configuration:

cluster.name: test_cluster
network.host: 192.168.107.71
discovery.zen.ping.unicast.hosts: ["192.168.107.73", "192.168.107.74", "192.168.107.75", "192.168.107.76"]
# the IPs in the above two settings change per node
discovery.zen.minimum_master_nodes: 2
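
For context, the role assignment mentioned above is done on 6.x with the node.* settings; a dedicated master would carry something like the following (these exact values are an assumption and are set per node):

node.master: true
node.data: false
node.ingest: false
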
The hosts are pingable and have access to each other.

Any help would be much appreciated.


(Mark Walkom) #2

That "No route to host" error is why. You should check your networking.

Having only 2 master-eligible nodes with minimum_master_nodes set to 2 is a really bad idea, see https://www.elastic.co/guide/en/elasticsearch/guide/2.x/important-configuration-changes.html#_minimum_master_nodes
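
In short, that page recommends setting minimum_master_nodes to a quorum of the master-eligible nodes, (master-eligible / 2) + 1, which only gives any failure tolerance once there are at least three of them. A sketch of what that would look like (the three-master layout is an assumption, not the poster's current setup):

# with three master-eligible nodes, quorum = (3 / 2) + 1 = 2
discovery.zen.minimum_master_nodes: 2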


#3

The hosts are reachable (I can telnet to port 9300), and two of the nodes are master-eligible, which is a valid configuration.
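
If the port really is open in both directions, the next thing worth checking is node-master-2's own log around the same timestamps, to see whether it reports a mirror-image connection failure. A sketch, assuming the systemd package install:

sudo journalctl -u elasticsearch --since "2018-05-28 21:00" | tail -n 50
# or the cluster log file, typically /var/log/elasticsearch/test_cluster.log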


(David Pilato) #4

2 master nodes might be a valid configuration indeed. It's just very bad practice.

BTW, just put the master-eligible nodes in this list:

discovery.zen.ping.unicast.hosts: ["192.168.107.73", "192.168.107.74", "192.168.107.75", "192.168.107.76"]
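
Going by the node names in the log above (node-master-1 at 192.168.107.75 and node-master-2 at 192.168.107.76), a trimmed list would look something like this (a sketch based on that assumption):

discovery.zen.ping.unicast.hosts: ["192.168.107.75", "192.168.107.76"]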

(R_C) #5

Are both ports 9200 and 9300 open on all the nodes?


#6

The master-eligible nodes are on the list, and both ports 9200 and 9300 are open.
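
If the ports are open but discovery still fails, it is worth confirming on each node which address Elasticsearch is actually bound to, since network.host differs per node. A quick check (the ss invocation is an assumption, any socket-listing tool works):

ss -tlnp | grep -E '9200|9300'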


(R_C) #7

Are your master nodes dedicated, or data+master nodes?


(system) #8

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.