Elasticsearch migration from 6.8.8 to 7.10.2 in cluster mode: 1st node does not start until the 2nd node starts, which was not the case in the earlier version

I am setting up a 3-node cluster. Below is my elasticsearch.yml file:
cluster.name: SpectrumCluster
node.name: node-1
cluster.initial_master_nodes: 10.2.107.14:9300,172.30.6.22:9300,172.30.5.255:9300
discovery.seed_hosts: 10.2.107.14:9300,172.30.6.22:9300,172.30.5.255:9300

When I start all three nodes simultaneously, it works fine.
But when I start only node 1 and leave node 2 and node 3 stopped, node 1 does not initialize either, and I get the error below:

[2021-11-18T08:49:28,273][DEBUG][o.e.d.PeerFinder ] [node-1] Peer{transportAddress=172.30.5.255:9300, discoveryNode=null, peersRequestInFlight=false} connection failed
org.elasticsearch.transport.ConnectTransportException: [172.30.5.255:9300] connect_exception
at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:978) ~[elasticsearch-7.10.2.jar:7.10.2]
at org.elasticsearch.action.ActionListener.lambda$toBiConsumer$2(ActionListener.java:198) ~[elasticsearch-7.10.2.jar:7.10.2]
at org.elasticsearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:42) ~[elasticsearch-core-7.10.2.jar:7.10.2]

This was not the case with the previous version, 6.8.8, where the 1st node started gracefully even if the 2nd node was down.
Is there a way in 7.10.2 to start the 1st node gracefully even if the 2nd node is down?

If your 6.8 cluster started up correctly with only 1 of 3 master-eligible nodes present, it was misconfigured and did not have discovery.zen.minimum_master_nodes correctly set to 2. This can lead to data loss. A 3-node cluster should always need a majority of master-eligible nodes available in order to elect a master (in this case 2). Elasticsearch 7.x now enforces this and no longer allows this quite common misconfiguration. Have a look at these docs for more information.
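
To make the majority rule concrete, here is a minimal sketch of the arithmetic in Python. The function names `quorum` and `can_elect_master` are hypothetical and are not part of any Elasticsearch API. The model is deliberately simplified: it only covers a cluster with a fixed number of master-eligible nodes, as in the 3-node setup above.

```python
def quorum(master_eligible: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def can_elect_master(master_eligible: int, available: int) -> bool:
    """True if enough master-eligible nodes are up to form a majority."""
    return available >= quorum(master_eligible)


if __name__ == "__main__":
    # Walk through the 3-node case from this thread.
    for up in range(1, 4):
        print(f"{up} of 3 nodes up -> can elect master: {can_elect_master(3, up)}")
```

With 3 master-eligible nodes the quorum is 2, so a single node on its own cannot elect a master, which matches the behaviour described in the question.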

I agree that I may have configured 6.8 incorrectly, but our application works in such a way that the 1st Elasticsearch node must start first, and the other nodes join it later. Since 7.x does not allow one node to start until a 2nd node joins it, I am looking for some hacky way around this so that it does not become a blocker for me.

As far as I know, there is no hacky way around this. The behaviour was specifically designed to prevent this type of misconfiguration, as it can cause silent data loss.
