Hello,
I'm trying to set up two ES nodes as a single cluster, and I had all but given up by the time of posting this.
I have two instances running, with the following configs -
# es-01
cluster.name: dc-world
node.name: smallville
path.data: /es/data
path.logs: /es/log
network.host: 0.0.0.0
discovery.seed_hosts: ["10.116.48.116"] # IP of es-02
cluster.initial_master_nodes: ["gotham", "smallville"]
# es-02
cluster.name: dc-world
node.name: gotham
path.data: /es/data
path.logs: /es/log
network.host: 0.0.0.0
network.publish_host: _site_
cluster.initial_master_nodes: ["gotham", "smallville"]
I'm getting the following error (from es-01 logs; stack trace trimmed because of the post length limit) -
[2019-07-19T06:33:02,824][WARN ][o.e.c.c.ClusterFormationFailureHelper] [smallville] master not discovered or elected yet, an election requires two nodes with ids [JYKLqIsvR0yruH0ecPG4wA, CTAdAiYbT-ajAXtSVEs3Bw], have discovered [{gotham}{CTAdAiYbT-ajAXtSVEs3Bw}{V18vdJW3TyKlfMJIJhlGkQ}{10.116.48.116}{10.116.48.116:9300}{ml.machine_memory=16651354112, ml.max_open_jobs=20, xpack.installed=true}] which is not a quorum; discovery will continue using [10.116.48.116:9300] from hosts providers and [{smallville}{JYKLqIsvR0yruH0ecPG4wA}{-9WFnGHqQnaxid1cV4oAHg}{161.202.2.237}{161.202.2.237:9300}{ml.machine_memory=16651354112, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 149, last-accepted version 0 in term 0
[2019-07-19T06:33:06,904][INFO ][o.e.c.c.JoinHelper ] [smallville] failed to join {smallville}{JYKLqIsvR0yruH0ecPG4wA}{-9WFnGHqQnaxid1cV4oAHg}{161.202.2.237}{161.202.2.237:9300}{ml.machine_memory=16651354112, xpack.installed=true, ml.max_open_jobs=20} with JoinRequest{sourceNode={smallville}{JYKLqIsvR0yruH0ecPG4wA}{-9WFnGHqQnaxid1cV4oAHg}{161.202.2.237}{161.202.2.237:9300}{ml.machine_memory=16651354112, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional[Join{term=150, lastAcceptedTerm=0, lastAcceptedVersion=0, sourceNode={smallville}{JYKLqIsvR0yruH0ecPG4wA}{-9WFnGHqQnaxid1cV4oAHg}{161.202.2.237}{161.202.2.237:9300}{ml.machine_memory=16651354112, xpack.installed=true, ml.max_open_jobs=20}, targetNode={smallville}{JYKLqIsvR0yruH0ecPG4wA}{-9WFnGHqQnaxid1cV4oAHg}{161.202.2.237}{161.202.2.237:9300}{ml.machine_memory=16651354112, xpack.installed=true, ml.max_open_jobs=20}}]}
org.elasticsearch.transport.RemoteTransportException: [smallville][161.202.2.237:9300][internal:cluster/coordination/join]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: received a newer join from {smallville}{JYKLqIsvR0yruH0ecPG4wA}{-9WFnGHqQnaxid1cV4oAHg}{161.202.2.237}{161.202.2.237:9300}{ml.machine_memory=16651354112, xpack.installed=true, ml.max_open_jobs=20}
at org.elasticsearch.cluster.coordination.JoinHelper$CandidateJoinAccumulator.handleJoinRequest(JoinHelper.java:451) [elasticsearch-7.2.0.jar:7.2.0]
......
....
es-02 logs -
[2019-07-19T06:37:45,288][WARN ][o.e.c.c.ClusterFormationFailureHelper] [gotham] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [gotham, smallville] to bootstrap a cluster: have discovered []; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9301, 127.0.0.1:9302, 127.0.0.1:9303, 127.0.0.1:9304, 127.0.0.1:9305, [::1]:9300, [::1]:9301, [::1]:9302, [::1]:9303, [::1]:9304, [::1]:9305] from hosts providers and [{gotham}{CTAdAiYbT-ajAXtSVEs3Bw}{V18vdJW3TyKlfMJIJhlGkQ}{10.116.48.116}{10.116.48.116:9300}{ml.machine_memory=16651354112, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 218, last-accepted version 0 in term 0
[2019-07-19T06:38:28,567][INFO ][o.e.c.c.JoinHelper ] [gotham] failed to join {smallville}{JYKLqIsvR0yruH0ecPG4wA}{-9WFnGHqQnaxid1cV4oAHg}{161.202.2.237}{161.202.2.237:9300}{ml.machine_memory=16651354112, ml.max_open_jobs=20, xpack.installed=true} with JoinRequest{sourceNode={gotham}{CTAdAiYbT-ajAXtSVEs3Bw}{V18vdJW3TyKlfMJIJhlGkQ}{10.116.48.116}{10.116.48.116:9300}{ml.machine_memory=16651354112, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional[Join{term=225, lastAcceptedTerm=0, lastAcceptedVersion=0, sourceNode={gotham}{CTAdAiYbT-ajAXtSVEs3Bw}{V18vdJW3TyKlfMJIJhlGkQ}{10.116.48.116}{10.116.48.116:9300}{ml.machine_memory=16651354112, xpack.installed=true, ml.max_open_jobs=20}, targetNode={smallville}{JYKLqIsvR0yruH0ecPG4wA}{-9WFnGHqQnaxid1cV4oAHg}{161.202.2.237}{161.202.2.237:9300}{ml.machine_memory=16651354112, ml.max_open_jobs=20, xpack.installed=true}}]}
org.elasticsearch.transport.NodeNotConnectedException: [smallville][161.202.2.237:9300] Node not connected
at org.elasticsearch.transport.ConnectionManager.getConnection(ConnectionManager.java:151) ~[elasticsearch-7.2.0.jar:7.2.0]
...
...
This is after numerous failed attempts, and after reading and digesting everything I could find (forming single-node clusters; security group rule issues; understanding 9200 vs 9300; almost failing to understand unicast; reading articles about 6.8's zen discovery).
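Since the es-02 log ends in NodeNotConnectedException for 161.202.2.237:9300, one thing worth ruling out is plain TCP reachability of the transport port (9300) between the two hosts. Below is a minimal Python sketch of such a check. The helper names can_connect and check_targets are made up for illustration, and the addresses to pass in would be the ones that appear in the logs above. It only tells whether a TCP connection can be opened, not whether Elasticsearch would accept the node.

```python
import socket
from typing import Iterable, List, Tuple


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, unreachable hosts and DNS failures.
        return False


def check_targets(
    targets: Iterable[Tuple[str, int]], timeout: float = 3.0
) -> List[Tuple[str, int, bool]]:
    """Probe each (host, port) pair and return (host, port, reachable) tuples."""
    return [(host, port, can_connect(host, port, timeout)) for host, port in targets]
```

Run on each node against the other node's addresses, for example check_targets([("10.116.48.116", 9300), ("161.202.2.237", 9300)]), this would show which of the private and public addresses actually accepts connections on the transport port.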
Before I added network.publish_host: _site_ to the es-02 config, I was getting this in the logs (from es-01; similar on es-02) -
[2019-07-19T05:53:55,034][WARN ][o.e.c.c.ClusterFormationFailureHelper] [smallville] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [10.116.48.112, 10.116.48.116] to bootstrap a cluster: have discovered []; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9301, 127.0.0.1:9302, 127.0.0.1:9303, 127.0.0.1:9304, 127.0.0.1:9305, [::1]:9300, [::1]:9301, [::1]:9302, [::1]:9303, [::1]:9304, [::1]:9305] from hosts providers and [{smallville}{8MEe4svaTseS-P3BwzjK9A}{ecMet_NNRv2arG-ZAuuq7g}{161.202.2.237}{161.202.2.237:9300}{ml.machine_memory=16651354112, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
If you need any more info, please let me know.
I'm on version 7.2.0 (build 508c38a, installed from rpm). OS: CentOS 7.