Master not discovered or elected yet, an election requires a node with id

Hi,

Version: 7.0.1
I have 2 nodes on my local machine with the following configuration.
First node yml details:
node.max_local_storage_nodes: 10
node.name: node-1
node.data: true
node.master: true
transport.tcp.port: 9305
transport.publish_port: 9305
discovery.seed_hosts: ["node-1:9305", "node-2:9305"]

Second node yml details:
node.max_local_storage_nodes: 10
http.port: 9201
node.name: node-2
node.data: true
node.master: true
transport.tcp.port: 9300
transport.publish_port: 9300
discovery.seed_hosts: ["node-1:9300", "node-2:9300"]

When I start the first node on localhost:9200, it works fine.
When I start the second node, it shows the error below, while the first node keeps working fine:

master not discovered or elected yet, an election requires a node with id [C84W6hvTT3uX-lz0IAmhhA], have discovered which is not a quorum; discovery will continue using [172.23.249.2:9300, 172.23.249.3:9300] from hosts providers and [{node-2}{tZ1lBzg3SMeYtFK4oo09Ww}{UbiDQC2OQ5GV0RSjDw_5og}{127.0.0.1}{127.0.0.1:9300}{ml.machine_memory=8468369408, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 30, last-accepted version 472 in term 30

Please tell me the reason for the above error. I also want to know why we need node.max_local_storage_nodes, and what transport.tcp.port: 9300 and transport.publish_port: 9300 are used for. Thanks!

You should list all master-eligible nodes in discovery.seed_hosts, each with its own transport port. In your example, that is:

discovery.seed_hosts: ["node-1:9305", "node-2:9300"]

Thanks for your reply, wang.

I have now changed the yml configuration on my local machine as follows:

First node yml details:
http.port: 9200
node.name: node-1
network.host: 10.65.32.206
network.publish_host: 10.65.32.206
node.data: true
node.master: true
transport.tcp.port: 9305
transport.publish_port: 9305
discovery.seed_hosts: ["10.65.32.206:9305", "10.65.32.206:9300"]
cluster.name: eslocal

Second node yml details:
http.port: 9201
node.name: node-2
network.host: 10.65.32.206
network.publish_host: 10.65.32.206
node.master: true
node.data: true
transport.tcp.port: 9300
transport.publish_port: 9300
discovery.seed_hosts: ["10.65.32.206:9305", "10.65.32.206:9300"]
cluster.name: eslocal

Now both nodes are up and running fine. But when I stop the first node, the second node logs the same error as before:

org.elasticsearch.transport.ConnectTransportException: [node-1][10.65.32.206:9305] connect_exception
at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1299) ~[elasticsearch-7.0.1.jar:7.0.1]
at org.elasticsearch.action.ActionListener.lambda$toBiConsumer$2(ActionListener.java:99) ~[elasticsearch-7.0.1.jar:7.0.1]
at org.elasticsearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:42) ~[elasticsearch-core-7.0.1.jar:7.0.1]
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) ~[?:?]
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) ~[?:?]
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2159) ~[?:?]
at org.elasticsearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:57) ~[elasticsearch-core-7.0.1.jar:7.0.1]
at org.elasticsearch.transport.netty4.Netty4TcpChannel.lambda$new$1(Netty4TcpChannel.java:72) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:511) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:504) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:483) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:424) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:121) ~[?:?]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:327) ~[?:?]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:343) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:556) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:510) ~[?:?]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) ~[?:?]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) ~[?:?]
at java.lang.Thread.run(Thread.java:835) [?:?]
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: 10.65.32.206/10.65.32.206:9305
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779) ~[?:?]
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[?:?]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[?:?]
... 6 more
Caused by: java.net.ConnectException: Connection refused: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779) ~[?:?]
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[?:?]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[?:?]
... 6 more
[2019-06-22T15:08:20,988][WARN ][o.e.c.c.ClusterFormationFailureHelper] [node-2] master not discovered or elected yet, an election requires a node with id [C84W6hvTT3uX-lz0IAmhhA], have discovered which is not a quorum; discovery will continue using [10.65.32.206:9305] from hosts providers and [{node-1}{C84W6hvTT3uX-lz0IAmhhA}{vGP6IBDCR1yxeko5W65ElQ}{10.65.32.206}{10.65.32.206:9305}{ml.machine_memory=8468365312, ml.max_open_jobs=20, xpack.installed=true}, {node-2}{tZ1lBzg3SMeYtFK4oo09Ww}{yDstkyZ6QXWHI-xmt0XvvA}{10.65.32.206}{10.65.32.206:9300}{ml.machine_memory=8468365312, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 57, last-accepted version 930 in term 57

Please help me with this. Thanks!

That looks like the expected behaviour. If you want the cluster to keep on working when you stop a node then you need at least 3 nodes.
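
As a rough illustration of why two nodes are not enough, here is the usual majority-quorum arithmetic. This is a simplified model, not Elasticsearch's exact voting-configuration logic, and the symbols are assumptions introduced only for this sketch: n is the number of master-eligible nodes that vote, q(n) is the majority needed to elect a master, and f(n) is how many of those nodes can stop while an election is still possible.

```latex
\[
  q(n) = \left\lfloor \frac{n}{2} \right\rfloor + 1,
  \qquad
  f(n) = n - q(n)
\]
% n = 2: q = 2, f = 0  (no voting node can be lost)
% n = 3: q = 2, f = 1  (any one voting node can be lost)
```

Under this model a third node is the smallest change that makes f(n) at least 1.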

Thanks for your reply, David.

Q1: Is it a new change in version 7.0.1 that we need 3 nodes to keep the cluster working if any one node goes down?

I ask because I previously installed a 2-node cluster on version 6.2 across 2 different servers, and it kept working fine when either node went down.

Q2: I have installed both nodes on my local machine; is that why it is not working? Do I have to install the nodes on 2 different servers for a 2-node cluster to keep working properly when one node goes down?

Please clarify the above questions. Thanks!

No, this constraint is a fundamental property of distributed systems and is not something that Elasticsearch can change. It is mentioned in the documentation for every version, going back to the manual for 1.4. Before 7.0 it was possible to misconfigure a 2-node cluster as you describe, seriously risking data loss, but this bug is fixed in 7.0.

It depends on what you mean by "work properly". If you put all your nodes on a single server and that server then suffers a hardware failure, then of course your cluster will stop working.

Thanks for your reply, David.

By "work properly" I mean that in a 2-node cluster, if one node goes down, the second node should keep working. Is this possible if I install the 2 nodes on 2 different servers? Right now both nodes are installed on my local machine.
Is that the reason I am facing this issue (that is, when one node goes down, the second throws the error "master not discovered or elected yet")?

Thanks!

No, this is not possible. For this you need at least three nodes, no matter where they are running. They can all run on a single server if you want.
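
A minimal sketch of what a third node's yml might look like on the same machine, modelled on the two configurations posted earlier in this thread. The node name node-3 and the ports 9202 and 9310 are assumptions chosen for illustration; any free ports should do. The two existing nodes would presumably also want the new address added to their own discovery.seed_hosts.

```yaml
# Hypothetical third node; the name and ports below are illustrative assumptions.
cluster.name: eslocal
node.name: node-3
node.master: true
node.data: true
network.host: 10.65.32.206
network.publish_host: 10.65.32.206
http.port: 9202
transport.tcp.port: 9310
transport.publish_port: 9310
# List every master-eligible node by its transport address.
discovery.seed_hosts: ["10.65.32.206:9305", "10.65.32.206:9300", "10.65.32.206:9310"]
```

With all three nodes joined, stopping any single one should leave two master-eligible nodes available to elect a master.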

Okay, David. Thanks for your quick reply. I will try adding one more node.
