Problems getting new node to join cluster

Russell_Fulton · June 13, 2021, 11:29pm

I am trying to add a couple of nodes to an existing cluster of 3 nodes. These "new" nodes have been in the cluster in the past but were taken out for "refurbishment" and now I am having issues getting them back in.

I am seeing this in the logs of the machine that is trying to join:

     [2021-06-14T10:08:11,368][WARN ][o.e.c.c.JoinHelper       ] [secesprd05] last failed join attempt was 7.7s ago, failed to join {secesprd01}{kAWPcpoxSNSN9WlUsYlQlg}{pKGIqAxXRTy4NHxIq2HgwA}{10.6.0.67}{10.6.0.67:9300}{cdhmw}{xpack.installed=true, molochtype=hot, transform.node=false} with JoinRequest{sourceNode={secesprd05}{4cPiEfloRoKgvx-NqVp4aA}{fnNOdzfIT66oVhACJLUptg}{130.216.236.212}{130.216.236.212:9300}{c}{xpack.installed=true, molochtype=none, transform.node=false}, minimumTerm=21, optionalJoin=Optional.empty}
    org.elasticsearch.transport.RemoteTransportException: [secesprd01][10.6.0.67:9300][internal:cluster/coordination/join]
    Caused by: org.elasticsearch.transport.ConnectTransportException: [secesprd05].   [130.216.236.212:9300] connect_timeout[30s]
        at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onTimeout(TcpTransport.java:984) ~[elasticsearch-7.10.1.jar:7.10.1]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:678) ~[elasticsearch-7.10.1.jar:7.10.1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
        at java.lang.Thread.run(Thread.java:832) [?:?]
    [2021-06-14T10:08:11,369][WARN ][o.e.c.c.ClusterFormationFailureHelper] [secesprd05] master not discovered yet: have discovered [{secesprd05}{4cPiEfloRoKgvx-NqVp4aA}{fnNOdzfIT66oVhACJLUptg}{130.216.236.212}{130.216.236.212:9300}{c}{xpack.installed=true, molochtype=none, transform.node=false}, {secesprd01}{kAWPcpoxSNSN9WlUsYlQlg}{pKGIqAxXRTy4NHxIq2HgwA}{10.6.0.67}{10.6.0.67:9300}{cdhmw}{xpack.installed=true, molochtype=hot, transform.node=false}, {secesprd02}{6UDagJW2T3eWM-0PQJ0rMA}{HLQJOMv1SpOCPfcJZqe2dg}{10.6.0.68}{10.6.0.68:9300}{cdhmw}{xpack.installed=true, molochtype=hot, transform.node=false}, {secmonprd07}{TNHldGyAQ52sNlIbGPbgMg}{QQ3Iau6fQaKF4-eqT6oGDQ}{130.216.5.111}{130.216.5.111:9300}{dmw}{xpack.installed=true, molochtype=warm, transform.node=false}]; discovery will continue using [10.6.0.67:9300, 10.6.0.68:9300, 130.216.5.111:9300] from hosts providers and [] from last-known cluster state; node term 21, last-accepted version 181555 in term 6

I.e. it is complaining about a timeout.

When I run tcpdump I can see that the nodes are communicating on port 9300, and there are no obvious errors or timeouts. (Pcaps available on request.)

There are no entries in the logs of the other nodes that indicate anything amiss (actually there are no logs for the time period at all).

I have tried restarting one of the existing nodes but that made no difference.

Both the new nodes show identical symptoms.

At a loss as to what to check next.

DavidTurner · June 14, 2021, 5:58am

Russell_Fulton:

    org.elasticsearch.transport.RemoteTransportException: [secesprd01][10.6.0.67:9300][internal:cluster/coordination/join]
    Caused by: org.elasticsearch.transport.ConnectTransportException: [secesprd05].   [130.216.236.212:9300] connect_timeout[30s]

Note that the connection timeout is for a connection from secesprd01 to secesprd05. The other direction is working ok.
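
One way to check that direction is to try opening a TCP connection from secesprd01 to the address secesprd05 is publishing (130.216.236.212:9300 in the log above). The following is only a rough sketch, assuming Python 3 is available on secesprd01; the helper name `can_connect` and the 5 second default are my own choices, not anything Elasticsearch provides. It only tests that the TCP handshake completes, which is the step the `connect_timeout[30s]` message suggests is failing.

```python
import socket
import sys


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration; it checks the TCP handshake only,
    not TLS or the Elasticsearch transport protocol itself.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, "connection refused", "no route to host", etc.
        return False


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example (run ON secesprd01): python3 check.py 130.216.236.212 9300
    host, port = sys.argv[1], int(sys.argv[2])
    print(f"{host}:{port} reachable: {can_connect(host, port)}")
```

If this reports `False` when run on secesprd01 but the same check from secesprd05 towards 10.6.0.67:9300 reports `True`, that would be consistent with something (a firewall rule or routing, for instance) blocking new connections in one direction only.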

