Elasticsearch 7, 3 nodes cluster setup, 1 failed to join

rozzaaq · May 2, 2019, 7:38am

I tried to configure a cluster with 3 nodes on GCP.

Two nodes join the same cluster (same UUID), but one node fails to join.

Here's my node configuration:

    cluster.name: rdy-elastic
    node.name: rdy-elastic2
    network.host: _site_
    discovery.seed_hosts:
        - 10.148.0.25
        - 10.148.0.26
        - 10.148.0.28
    cluster.initial_master_nodes:
        - 10.148.0.25
        - 10.148.0.26
        - 10.148.0.28
    xpack.security.enabled: true

Here's the cluster created:

    curl -XGET  10.148.0.26:9200/_cat/nodes
    10.148.0.28 7 95 6 0.00 0.03 0.05 mdi * rdy-elastic5
    10.148.0.26 7 96 6 0.00 0.01 0.01 mdi - rdy-elastic3
    10.148.0.25                       mdi - rdy-elastic2

Here are the logs from node rdy-elastic2:

> [2019-05-02T07:34:52,277][INFO ][o.e.c.c.JoinHelper       ] [rdy-elastic2] failed to join {rdy-elastic5}{I8mg9PsaS9uOxsxTi6xaDg}{BW2CsP8-SBGH3gXpbUfv3A}{10.148.0.28}{10.148.0.28:9300}{ml.machine_memory=1771659264, ml.max_open_jobs=20, xpack.installed=true} with JoinRequest{sourceNode={rdy-elastic2}{l90J7FcPRmGIUL_MIdp-HA}{NSmiqpqORpCxJ_HsqUBFDQ}{10.148.0.25}{10.148.0.25:9300}{ml.machine_memory=1771659264, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional.empty}
> org.elasticsearch.transport.RemoteTransportException: [rdy-elastic5][10.148.0.28:9300][internal:cluster/coordination/join]
> Caused by: java.lang.IllegalStateException: failure when sending a validation request to node
>         at org.elasticsearch.cluster.coordination.Coordinator$3.onFailure(Coordinator.java:500) ~[elasticsearch-7.0.0.jar:7.0.0]
>         at org.elasticsearch.cluster.coordination.JoinHelper$5.handleException(JoinHelper.java:359) ~[elasticsearch-7.0.0.jar:7.0.0]
>         at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1124) ~[elasticsearch-7.0.0.jar:7.0.0]
>         at org.elasticsearch.transport.TransportService$8.run(TransportService.java:966) ~[elasticsearch-7.0.0.jar:7.0.0]
>         at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) ~[elasticsearch-7.0.0.jar:7.0.0]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
>         at java.lang.Thread.run(Thread.java:835) [?:?]
> Caused by: org.elasticsearch.transport.NodeDisconnectedException: [rdy-elastic2][10.148.0.25:9300][internal:cluster/coordination/join/validate] disconnected
> [2019-05-02T07:34:52,283][INFO ][o.e.c.c.JoinHelper       ] [rdy-elastic2] failed to join {rdy-elastic5}{I8mg9PsaS9uOxsxTi6xaDg}{BW2CsP8-SBGH3gXpbUfv3A}{10.148.0.28}{10.148.0.28:9300}{ml.machine_memory=1771659264, ml.max_open_jobs=20, xpack.installed=true} with JoinRequest{sourceNode={rdy-elastic2}{l90J7FcPRmGIUL_MIdp-HA}{NSmiqpqORpCxJ_HsqUBFDQ}{10.148.0.25}{10.148.0.25:9300}{ml.machine_memory=1771659264, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional.empty}
> org.elasticsearch.transport.RemoteTransportException: [rdy-elastic5][10.148.0.28:9300][internal:cluster/coordination/join]
> Caused by: java.lang.IllegalStateException: failure when sending a validation request to node
>         at org.elasticsearch.cluster.coordination.Coordinator$3.onFailure(Coordinator.java:500) ~[elasticsearch-7.0.0.jar:7.0.0]
>         at org.elasticsearch.cluster.coordination.JoinHelper$5.handleException(JoinHelper.java:359) ~[elasticsearch-7.0.0.jar:7.0.0]
>         at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1124) ~[elasticsearch-7.0.0.jar:7.0.0]
>         at org.elasticsearch.transport.TransportService$8.run(TransportService.java:966) ~[elasticsearch-7.0.0.jar:7.0.0]
>         at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) ~[elasticsearch-7.0.0.jar:7.0.0]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
>         at java.lang.Thread.run(Thread.java:835) [?:?]
> Caused by: org.elasticsearch.transport.NodeDisconnectedException: [rdy-elastic2][10.148.0.25:9300][internal:cluster/coordination/join/validate] disconnected
> [2019-05-02T07:34:52,295][WARN ][o.e.t.OutboundHandler    ] [rdy-elastic2] send message failed [channel: Netty4TcpChannel{localAddress=0.0.0.0/0.0.0.0:9300, remoteAddress=/10.148.0.28:58774}]

Sorry for the 1st post.

What did I do wrong?
Thank you.

DavidTurner · May 2, 2019, 7:56am

This is telling us that rdy-elastic2 is trying to join the master node rdy-elastic5, but rdy-elastic5 failed to connect back to rdy-elastic2 for some reason. Perhaps this is a networking issue? It'd help to see the corresponding logs from the master.

Could you use the </> button to format your logs like I've done above? It makes it much easier to read them, which makes it much more likely you'll get an answer to your question.

rozzaaq · May 2, 2019, 8:05am

Thank you for your instant reply.

Yes, I'll try to format it correctly.

[2019-05-02T07:47:06,061][INFO ][o.e.c.s.MasterService    ] [rdy-elastic5] node-join[{rdy-elastic2}{l90J7FcPRmGIUL_MIdp-HA}{NSmiqpqORpCxJ_HsqUBFDQ}{10.148.0.25}{10.148.0.25:9300}{ml.machine_memory=1771659264, ml.max_open_jobs=20, xpack.installed=true} join existing leader], term: 1, version: 2286, reason: added {{rdy-elastic2}{l90J7FcPRmGIUL_MIdp-HA}{NSmiqpqORpCxJ_HsqUBFDQ}{10.148.0.25}{10.148.0.25:9300}{ml.machine_memory=1771659264, ml.max_open_jobs=20, xpack.installed=true},}
[2019-05-02T07:47:06,076][INFO ][o.e.c.s.ClusterApplierService] [rdy-elastic5] added {{rdy-elastic2}{l90J7FcPRmGIUL_MIdp-HA}{NSmiqpqORpCxJ_HsqUBFDQ}{10.148.0.25}{10.148.0.25:9300}{ml.machine_memory=1771659264, ml.max_open_jobs=20, xpack.installed=true},}, term: 1, version: 2286, reason: Publication{term=1, version=2286}
[2019-05-02T07:47:06,134][DEBUG][o.e.a.a.c.n.s.TransportNodesStatsAction] [rdy-elastic5] failed to execute on node [l90J7FcPRmGIUL_MIdp-HA]
org.elasticsearch.transport.RemoteTransportException: [rdy-elastic2][10.148.0.25:9300][cluster:monitor/nodes/stats[n]]
Caused by: org.elasticsearch.ElasticsearchSecurityException: missing authentication credentials for action [cluster:monitor/nodes/stats[n]]
        at org.elasticsearch.xpack.core.security.support.Exceptions.authenticationError(Exceptions.java:18) ~[?:?]
        at org.elasticsearch.xpack.core.security.authc.DefaultAuthenticationFailureHandler.createAuthenticationError(DefaultAuthenticationFailureHandler.java:154) ~[?:?]
        at org.elasticsearch.xpack.core.security.authc.DefaultAuthenticationFailureHandler.missingToken(DefaultAuthenticationFailureHandler.java:109) ~[?:?]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$AuditableTransportRequest.anonymousAccessDenied(AuthenticationService.java:650) ~[?:?]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$handleNullToken$19(AuthenticationService.java:466) ~[?:?]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.handleNullToken(AuthenticationService.java:471) ~[?:?]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.consumeToken(AuthenticationService.java:355) ~[?:?]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$extractToken$9(AuthenticationService.java:326) ~[?:?]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.extractToken(AuthenticationService.java:344) ~[?:?]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$checkForApiKey$3(AuthenticationService.java:287) ~[?:?]
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61) ~[elasticsearch-7.0.1.jar:7.0.1]
        at org.elasticsearch.xpack.security.authc.ApiKeyService.authenticateWithApiKeyIfPresent(ApiKeyService.java:345) ~[?:?]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.checkForApiKey(AuthenticationService.java:268) ~[?:?]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$authenticateAsync$0(AuthenticationService.java:251) ~[?:?]
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61) ~[elasticsearch-7.0.1.jar:7.0.1]
        at org.elasticsearch.xpack.security.authc.TokenService.getAndValidateToken(TokenService.java:310) ~[?:?]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$authenticateAsync$2(AuthenticationService.java:247) ~[?:?]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$lookForExistingAuthentication$6(AuthenticationService.java:305) ~[?:?]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lookForExistingAuthentication(AuthenticationService.java:316) ~[?:?]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.authenticateAsync(AuthenticationService.java:243) ~[?:?]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.access$000(AuthenticationService.java:195) ~[?:?]
        at org.elasticsearch.xpack.security.authc.AuthenticationService.authenticate(AuthenticationService.java:138) ~[?:?]
        at org.elasticsearch.xpack.security.transport.ServerTransportFilter$NodeProfile.inbound(ServerTransportFilter.java:121) ~[?:?]
        at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:307) ~[?:?]
        at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) ~[elasticsearch-7.0.1.jar:7.0.1]
        at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1077) ~[elasticsearch-7.0.1.jar:7.0.1]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) ~[elasticsearch-7.0.1.jar:7.0.1]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.0.1.jar:7.0.1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
        at java.lang.Thread.run(Thread.java:835) [?:?]
[2019-05-02T07:47:07,721][INFO ][o.e.c.s.MasterService    ] [rdy-elastic5] node-left[{rdy-elastic2}{l90J7FcPRmGIUL_MIdp-HA}{NSmiqpqORpCxJ_HsqUBFDQ}{10.148.0.25}{10.148.0.25:9300}{ml.machine_memory=1771659264, ml.max_open_jobs=20, xpack.installed=true} disconnected], term: 1, version: 2288, reason: removed {{rdy-elastic2}{l90J7FcPRmGIUL_MIdp-HA}{NSmiqpqORpCxJ_HsqUBFDQ}{10.148.0.25}{10.148.0.25:9300}{ml.machine_memory=1771659264, ml.max_open_jobs=20, xpack.installed=true},}
[2019-05-02T07:47:07,735][INFO ][o.e.c.s.ClusterApplierService] [rdy-elastic5] removed {{rdy-elastic2}{l90J7FcPRmGIUL_MIdp-HA}{NSmiqpqORpCxJ_HsqUBFDQ}{10.148.0.25}{10.148.0.25:9300}{ml.machine_memory=1771659264, ml.max_open_jobs=20, xpack.installed=true},}, term: 1, version: 2288, reason: Publication{term=1, version=2288}

Does this chunk of the log help?

I installed a fresh Elasticsearch on GCP (Java 8, Elasticsearch, no Nginx / UFW), set the same configuration on each node, and started the nodes one at a time, leaving some interval before starting the next one.

Again, sorry, I can't work out what I did wrong.

DavidTurner · May 2, 2019, 8:12am

Thanks, that's helpful.

rozzaaq:

[2019-05-02T07:47:06,076][INFO ][o.e.c.s.ClusterApplierService] [rdy-elastic5] added {{rdy-elastic2}{l90J7FcPRmGIUL_MIdp-HA}{NSmiqpqORpCxJ_HsqUBFDQ}{10.148.0.25}{10.148.0.25:9300}{ml.machine_memory=1771659264, ml.max_open_jobs=20, xpack.installed=true},}, term: 1, version: 2286, reason: Publication{term=1, version=2286}
[2019-05-02T07:47:06,134][DEBUG][o.e.a.a.c.n.s.TransportNodesStatsAction] [rdy-elastic5] failed to execute on node [l90J7FcPRmGIUL_MIdp-HA]
org.elasticsearch.transport.RemoteTransportException: [rdy-elastic2][10.148.0.25:9300][cluster:monitor/nodes/stats[n]]
Caused by: org.elasticsearch.ElasticsearchSecurityException: missing authentication credentials for action [cluster:monitor/nodes/stats[n]]

These four lines tell us that rdy-elastic2 did actually manage to join the cluster, but then rdy-elastic5 sent it a stats request and rdy-elastic2 rejected it due to missing authentication credentials. It sounds like there's some kind of mismatch in their respective security configurations.
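
One way to look for such a mismatch is to compare the `xpack.security.*` lines of each node's elasticsearch.yml. The following is only a minimal sketch: it assumes the settings are written as flat top-level `key: value` lines, as in the configuration posted above, and it is not a real YAML parser. The function names `read_flat_settings` and `diff_settings` are made up for this illustration.

```python
def read_flat_settings(text, prefix="xpack.security."):
    """Collect top-level 'key: value' lines whose key starts with prefix.

    Assumption: settings use the flat dotted form (no nested YAML).
    Indented lines, list items, and comments are ignored.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop comments and trailing spaces
        if not line or line[0].isspace() or ":" not in line:
            continue
        key, value = line.split(":", 1)
        key = key.strip()
        if key.startswith(prefix):
            settings[key] = value.strip()
    return settings


def diff_settings(configs):
    """Return {setting: {node: value}} for every setting the nodes disagree on.

    configs maps a node name to the text of that node's config file.
    A node that does not set a key is reported with the value None.
    """
    parsed = {node: read_flat_settings(text) for node, text in configs.items()}
    all_keys = set()
    for settings in parsed.values():
        all_keys.update(settings)
    mismatches = {}
    for key in sorted(all_keys):
        values = {node: settings.get(key) for node, settings in parsed.items()}
        if len(set(values.values())) > 1:
            mismatches[key] = values
    return mismatches
```

Feeding it the contents of the config file from every node and getting back an empty dict means the security lines agree. Note that it skips indented lines, so stray leading whitespace in front of a setting would show up as that setting missing on that node.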

rozzaaq · May 6, 2019, 9:15am

It turns out that I had added extra whitespace by pressing Enter. After I deleted it, clustering works fine.

Just one more case.

I tried to add a data node with an identical configuration:

    cluster.name: rdy-elastic
    node.name: rdy-elastic4
    network.host: _site_
    discovery.seed_hosts:
        - 10.148.0.25
        - 10.148.0.26
        - 10.148.0.28
    xpack.security.enabled: true
    node.master: false
    node.data: true
    node.ingest: false

rozzaaq · May 6, 2019, 9:16am

But it failed to join the cluster. Here's the log:

[2019-05-06T06:54:01,237][WARN ][o.e.c.c.ClusterFormationFailureHelper] [rdy-elastic4] master not discovered yet: have discovered [{rdy-elastic2}{7XA9hvgEQhm4zqt7hqRp1g}{50XzLOuJSq6EdUMoJDq-vg}{10.148.0.25}{10.148.0.25:9300}{ml.machine_memory=1771659264, ml.max_open_jobs=20, xpack.installed=true}, {rdy-elastic3}{4CuiSlBRRz2aQnL3V2sXUw}{hdTXY3_RTxezgLFvrvlblA}{10.148.0.26}{10.148.0.26:9300}{ml.machine_memory=1771659264, ml.max_open_jobs=20, xpack.installed=true}, {rdy-elastic5}{FPllsKviQiGkVylVgPds0A}{8AUnGUS6TeGPHGhXfhnnRQ}{10.148.0.28}{10.148.0.28:9300}{ml.machine_memory=1771659264, ml.max_open_jobs=20, xpack.installed=true}]; discovery will continue using [10.148.0.25:9300, 10.148.0.26:9300, 10.148.0.28:9300] from hosts providers and [{rdy-elastic4}{CK-0qbxfTOWNi7oTau93xQ}{kX9FE0MbToSfTYrFOMj_yw}{10.148.0.27}{10.148.0.27:9300}{ml.machine_memory=7836008448, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 2, last-accepted version 0 in term 0
[2019-05-06T06:54:11,239][WARN ][o.e.c.c.ClusterFormationFailureHelper] [rdy-elastic4] master not discovered yet: have discovered [{rdy-elastic2}{7XA9hvgEQhm4zqt7hqRp1g}{50XzLOuJSq6EdUMoJDq-vg}{10.148.0.25}{10.148.0.25:9300}{ml.machine_memory=1771659264, ml.max_open_jobs=20, xpack.installed=true}, {rdy-elastic3}{4CuiSlBRRz2aQnL3V2sXUw}{hdTXY3_RTxezgLFvrvlblA}{10.148.0.26}{10.148.0.26:9300}{ml.machine_memory=1771659264, ml.max_open_jobs=20, xpack.installed=true}, {rdy-elastic5}{FPllsKviQiGkVylVgPds0A}{8AUnGUS6TeGPHGhXfhnnRQ}{10.148.0.28}{10.148.0.28:9300}{ml.machine_memory=1771659264, ml.max_open_jobs=20, xpack.installed=true}]; discovery will continue using [10.148.0.25:9300, 10.148.0.26:9300, 10.148.0.28:9300] from hosts providers and [{rdy-elastic4}{CK-0qbxfTOWNi7oTau93xQ}{kX9FE0MbToSfTYrFOMj_yw}{10.148.0.27}{10.148.0.27:9300}{ml.machine_memory=7836008448, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 2, last-accepted version 0 in term 0
[2019-05-06T06:54:11,271][WARN ][o.e.n.Node               ] [rdy-elastic4] timed out while waiting for initial discovery state - timeout: 30s
[2019-05-06T06:54:11,283][INFO ][o.e.h.AbstractHttpServerTransport] [rdy-elastic4] publish_address {10.148.0.27:9200}, bound_addresses {10.148.0.27:9200}
[2019-05-06T06:54:11,283][INFO ][o.e.n.Node               ] [rdy-elastic4] started

rozzaaq · May 6, 2019, 9:17am

[2019-05-06T06:54:11,538][INFO ][o.e.c.c.JoinHelper       ] [rdy-elastic4] failed to join {rdy-elastic2}{7XA9hvgEQhm4zqt7hqRp1g}{50XzLOuJSq6EdUMoJDq-vg}{10.148.0.25}{10.148.0.25:9300}{ml.machine_memory=1771659264, ml.max_open_jobs=20, xpack.installed=true} with JoinRequest{sourceNode={rdy-elastic4}{CK-0qbxfTOWNi7oTau93xQ}{kX9FE0MbToSfTYrFOMj_yw}{10.148.0.27}{10.148.0.27:9300}{ml.machine_memory=7836008448, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional[Join{term=2, lastAcceptedTerm=0, lastAcceptedVersion=0, sourceNode={rdy-elastic4}{CK-0qbxfTOWNi7oTau93xQ}{kX9FE0MbToSfTYrFOMj_yw}{10.148.0.27}{10.148.0.27:9300}{ml.machine_memory=7836008448, xpack.installed=true, ml.max_open_jobs=20}, targetNode={rdy-elastic2}{7XA9hvgEQhm4zqt7hqRp1g}{50XzLOuJSq6EdUMoJDq-vg}{10.148.0.25}{10.148.0.25:9300}{ml.machine_memory=1771659264, ml.max_open_jobs=20, xpack.installed=true}}]}
org.elasticsearch.transport.RemoteTransportException: [rdy-elastic2][10.148.0.25:9300][internal:cluster/coordination/join]
Caused by: org.elasticsearch.transport.ConnectTransportException: [rdy-elastic4][10.148.0.27:9300] connect_exception
        at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1299) ~[elasticsearch-7.0.0.jar:7.0.0]
        at org.elasticsearch.action.ActionListener.lambda$toBiConsumer$2(ActionListener.java:99) ~[elasticsearch-7.0.0.jar:7.0.0]
        at org.elasticsearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:42) ~[elasticsearch-core-7.0.0.jar:7.0.0]
        at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) ~[?:?]
        at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) ~[?:?]
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2159) ~[?:?]
        at org.elasticsearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:57) ~[elasticsearch-core-7.0.0.jar:7.0.0]
        at org.elasticsearch.transport.netty4.Netty4TcpChannel.lambda$new$1(Netty4TcpChannel.java:72) ~[transport-netty4-client-7.0.0.jar:7.0.0]
        at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:511) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:504) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:483) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:424) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:121) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:269) ~[netty-transport-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:127) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:474) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) [netty-common-4.1.32.Final.jar:4.1.32.Final]
        at java.lang.Thread.run(Thread.java:835) [?:?]
Caused by: java.io.IOException: connection timed out: 10.148.0.27/10.148.0.27:9300
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:267) ~[netty-transport-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:127) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:474) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) ~[?:?]
        at java.lang.Thread.run(Thread.java:835) ~[?:?]

What did I do wrong?

rozzaaq · May 6, 2019, 9:20am

And this is the cluster:

    curl -XGET  10.148.0.25:9200/_cat/nodes
    10.148.0.26 10 94 0 0.00 0.00 0.00 mdi - rdy-elastic3
    10.148.0.25 13 95 0 0.00 0.00 0.00 mdi * rdy-elastic2
    10.148.0.28  7 94 0 0.00 0.00 0.00 mdi - rdy-elastic5

DavidTurner · May 6, 2019, 9:23am

rozzaaq:

org.elasticsearch.transport.RemoteTransportException:
[rdy-elastic2][10.148.0.25:9300][internal:cluster/coordination/join]
Caused by: org.elasticsearch.transport.ConnectTransportException:
[rdy-elastic4][10.148.0.27:9300] connect_exception
...
Caused by: java.io.IOException: connection timed out: 10.148.0.27/10.148.0.27:9300

rdy-elastic4 managed to connect to rdy-elastic2 but rdy-elastic2 cannot connect back to rdy-elastic4. The connection is timing out. Sometimes this is because the traffic is blocked by a firewall or other traffic filter.
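
Because the join needs a working connection in both directions, it can help to test the return path directly: run a plain TCP connect from the master (rdy-elastic2) to the new node's transport port. The sketch below is one hedged way to do that with Python's standard library. The helper names `can_connect` and `report` are hypothetical, and port 9300 is taken from the addresses in the logs above.

```python
import socket


def can_connect(host, port=9300, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def report(hosts, port=9300):
    """Print one line per host saying whether its transport port is reachable."""
    for host in hosts:
        state = "reachable" if can_connect(host, port) else "NOT reachable"
        print(f"{host}:{port} {state}")
```

For example, calling `report(["10.148.0.27"])` on rdy-elastic2 and `report(["10.148.0.25", "10.148.0.26", "10.148.0.28"])` on rdy-elastic4 covers both directions. A failure in only one direction would be consistent with the timeout in the log, although a successful TCP connect does not rule out other problems.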

rozzaaq · May 13, 2019, 8:01am

Please help, I still haven't found the solution. Here are the logs:

data node log
https://gist.github.com/rozzaaq/450039410ec4cfe08d40f121aa52566d
master node log
https://gist.github.com/rozzaaq/b21e67d7a8d40d7dfdd48ae4e54d6c1a

Am I missing something? Both are GCP Ubuntu 18.04 instances with no UFW / Nginx installed.

DavidTurner · May 13, 2019, 8:14am

Caused by: org.elasticsearch.ElasticsearchSecurityException: missing authentication credentials for action [cluster:monitor/nodes/stats[n]]

This suggests there is still something wrong with your security configuration. Is security enabled on all nodes? Make sure the configs match.
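
Besides comparing the config files, one rough check of whether security is actually in effect on a node is to send it an unauthenticated HTTP request: a node with security active normally answers 401 unless anonymous access has been configured. This is a sketch under assumptions, not a definitive test: it assumes plain HTTP on port 9200 (as in the curl commands earlier in this thread), and the helper names `http_status` and `describe` are invented for the example.

```python
import http.client


def http_status(host, port=9200, timeout=3.0):
    """Send an unauthenticated GET / and return the HTTP status code."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/")
        return conn.getresponse().status
    finally:
        conn.close()


def describe(status):
    """Turn a status code into a rough hint about the node's security state."""
    if status == 401:
        return "credentials required (security appears active)"
    if status == 200:
        return "anonymous request accepted (security inactive or anonymous access allowed)"
    return f"unexpected status {status}"
```

Running `describe(http_status(host))` against every node and getting different answers would point to nodes that do not agree on security, even if their config files look the same.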

rozzaaq · May 13, 2019, 8:47am

Yes, I set
xpack.security.enabled: true
for all of the nodes.

The 3-node cluster came up successfully, and I want to add 1 data node.
But the data node always fails to join.

rozzaaq · May 17, 2019, 7:18am

At last, I successfully added the new data node. What I did differently was to set every node's configuration to
xpack.security.enabled: false

I'll try the true value next.

Thank you for your assistance.

ellison001 · May 31, 2019, 8:43am

Hi, I ran into a similar issue with ES 7.1.0. I have three nodes with the same configuration file. Any two of them can form a cluster, but when the third node tries to join, the errors are the same as the ones you pasted.

BTW, the three nodes are on the same network and firewalld is inactive on all of them.

My configuration is below.

    cluster.name: dev-es
    node.name: es-01
    network.host: 0.0.0.0
    discovery.zen.ping.unicast.hosts:
    - 192.168.0.81
    - 192.168.0.82
    - 192.168.0.83
    cluster.initial_master_nodes:
    - 192.168.0.81
    - 192.168.0.82
    - 192.168.0.83
    xpack.security.enabled: true
    xpack.security.transport.ssl.enabled: false

system · June 28, 2019, 8:52am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
