ES.6 Node can't join master (Docker)

cy_lir · October 22, 2018, 3:07pm

Hi!

First, here's my architecture:

1 Docker swarm cluster running 2 Elasticsearch 6.4.2 nodes without trouble. They're called isbg01 and isbg02. IP: 10.11.0.10
1 standalone (non-swarm) Docker host in the same datacentre but on a different VM, running 1 Elasticsearch 6.4.2 node, isbg03. IP: 10.11.0.12

All ports are open (iptables INPUT/OUTPUT/FORWARD set to ACCEPT).
isbg01 and isbg02 form a cluster that works perfectly well. However, when I try to add isbg03 to the cluster, the following message appears in the isbg03 logs:

    [2018-10-22T14:42:03,732][WARN ][o.e.d.z.ZenDiscovery     ] [isbg03] failed to connect to master [{isbg02}{CLIuWm25TrC6lsNNRL2-0w}{Vvh09X8ISHivXNjSENCJKw}{10.0.2.21}{10.0.2.21:9300}{ml.machine_memory=5182058496, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}], retrying...
org.elasticsearch.transport.ConnectTransportException: [isbg02][10.0.2.21:9300] general node connection failure
        at org.elasticsearch.transport.TcpTransport.openConnection(TcpTransport.java:688) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.transport.TcpTransport.connectToNode(TcpTransport.java:542) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:329) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:316) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.discovery.zen.ZenDiscovery.joinElectedMaster(ZenDiscovery.java:507) [elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:475) [elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.discovery.zen.ZenDiscovery.access$2500(ZenDiscovery.java:88) [elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1245) [elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [elasticsearch-6.4.2.jar:6.4.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
        at java.lang.Thread.run(Thread.java:844) [?:?]
Caused by: java.lang.IllegalStateException: java.lang.InterruptedException
        at org.elasticsearch.transport.TcpChannel.awaitConnected(TcpChannel.java:153) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.transport.TcpTransport.openConnection(TcpTransport.java:643) ~[elasticsearch-6.4.2.jar:6.4.2]
        ... 11 more
Caused by: java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1079) ~[?:?]
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1367) ~[?:?]
        at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.get(BaseFuture.java:234) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:69) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.transport.TcpChannel.awaitConnected(TcpChannel.java:147) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.transport.TcpTransport.openConnection(TcpTransport.java:643) ~[elasticsearch-6.4.2.jar:6.4.2]
        ... 11 more

Yet there's nothing to be found in isbg02's logs.

To simplify the problem, I've temporarily removed isbg01 and slightly modified the config files to try to connect isbg02 and isbg03 directly.

Here are the configuration files (isbg02 first, then isbg03):

cluster.name: isbg
node.name: isbg02
discovery.zen.ping.unicast.hosts: ["10.11.0.12"]
discovery.zen.minimum_master_nodes: 2
path.data: /var/lib/elasticsearch
network.host: 0.0.0.0


cluster.name: isbg
node.name: isbg03
discovery.zen.ping.unicast.hosts: ["10.11.0.10:9301"]
network.host: 0.0.0.0
discovery.zen.minimum_master_nodes: 2

Curling each node from the other works: port 9200 gives me the usual JSON, and the transport ports (9300 for isbg03, 9301 for isbg02) give me the "this is not an HTTP port" message.

I'm running out of ideas here. The firewall doesn't seem to be blocking anything, and both nodes are running the same ES version.
Any ideas on how to solve this?

Thanks a lot!

cy_lir · October 23, 2018, 8:44am

Solved!

I had to set

network.publish_host: <VM IP>
transport.tcp.port: 9301 # if different from the default 9300
http.port: 9301

and map Docker port 9301 to 9301 (instead of mapping 9300 to 9301).
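
For anyone landing here later, a possible reading of why this works: in the log from the first post, isbg02 advertises itself as 10.0.2.21:9300, which appears to be a container-internal address rather than the VM's 10.11.0.10. isbg03 could reach the node on 10.11.0.10:9301 for discovery, but was then told to connect to an address and port it cannot reach. Below is a minimal sketch of what isbg02's elasticsearch.yml might look like with the fix merged into the config posted earlier. Using 10.11.0.10 for <VM IP> and leaving HTTP on the default 9200 are assumptions (the transport and HTTP listeners cannot share one port, and the curl test in the first post used 9200 for HTTP).

```yaml
# elasticsearch.yml for isbg02 (sketch; lines marked "assumed" are not confirmed by the thread)
cluster.name: isbg
node.name: isbg02
path.data: /var/lib/elasticsearch

# Bind to all interfaces inside the container.
network.host: 0.0.0.0
# Advertise the VM's address to other nodes instead of the container-internal IP.
network.publish_host: 10.11.0.10   # assumed: the swarm VM IP from the first post

# Transport listens on 9301 inside the container, so the advertised port
# matches the host port once Docker maps 9301 to 9301.
transport.tcp.port: 9301
# http.port is left at its default, 9200 (assumed).

discovery.zen.ping.unicast.hosts: ["10.11.0.12"]
discovery.zen.minimum_master_nodes: 2
```

The matching Docker port mapping would be 9301:9301. An alternative that was not tried in this thread is to keep the container's transport port at 9300, map 9301 to 9300, and set transport.publish_port: 9301 so that only the advertised port changes.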
