Elasticsearch nodes automatically disconnected

Hi Team,

We have a 3-node cluster (each node is both data and master eligible). It has 487 primary indices, each with one replica.

ES version: 7.4.0

However, nodes sometimes get disconnected and rejoin later. This happens 2 to 3 times a month. I have attached the related logs below.


[2021-07-08T12:30:53,193][INFO ][o.e.c.c.C.CoordinatorPublication] [node1] after [10s] publication of cluster state version [1494021] is still waiting for {node2}{tpjmH236RH6ovwUlUJ-ZHg}{-5ByFd1jQwa9EWiz8Qx1Kg}{IP2}{IP2:PORT2}{dm}{xpack.installed=true} [SENT_PUBLISH_REQUEST]
[2021-07-08T12:31:02,762][INFO ][o.e.m.j.JvmGcMonitorService] [node1] [gc][2878073] overhead, spent [315ms] collecting in the last [1s]
[2021-07-08T12:31:13,197][WARN ][o.e.c.c.C.CoordinatorPublication] [node1] after [30s] publication of cluster state version [1494021] is still waiting for {node2}{tpjmH236RH6ovwUlUJ-ZHg}{-5ByFd1jQwa9EWiz8Qx1Kg}{IP2}{IP2:PORT2}{dm}{xpack.installed=true} [SENT_PUBLISH_REQUEST]
[2021-07-08T12:31:23,303][INFO ][o.e.c.c.C.CoordinatorPublication] [node1] after [10s] publication of cluster state version [1494022] is still waiting for {node1}{Ul0e8cWAT3G88k3T0JE2vA}{eR_VQt9JSPaAAEfubqpwSg}{IP1}{IP1:PORT1}{dm}{xpack.installed=true} [WAITING_FOR_QUORUM], {node3}{crZRSEZoTK6yezvAKlKf2g}{EV1zweQOQiKzkN6QaZpoFQ}{IP3}{IP3:PORT3}{dm}{xpack.installed=true} [SENT_PUBLISH_REQUEST], {node2}{tpjmH236RH6ovwUlUJ-ZHg}{-5ByFd1jQwa9EWiz8Qx1Kg}{IP2}{IP2:PORT2}{dm}{xpack.installed=true} [SENT_PUBLISH_REQUEST]
[2021-07-08T12:31:30,461][WARN ][o.e.t.TransportService   ] [node1] Received response for a request that has timed out, sent [46716ms] ago, timed out [36718ms] ago, action [internal:coordination/fault_detection/follower_check], node [{node3}{crZRSEZoTK6yezvAKlKf2g}{EV1zweQOQiKzkN6QaZpoFQ}{IP3}{IP3:PORT3}{dm}{xpack.installed=true}], id [61301438]
[2021-07-08T12:31:30,461][WARN ][o.e.t.TransportService   ] [node1] Received response for a request that has timed out, sent [35718ms] ago, timed out [25751ms] ago, action [internal:coordination/fault_detection/follower_check], node [{node3}{crZRSEZoTK6yezvAKlKf2g}{EV1zweQOQiKzkN6QaZpoFQ}{IP3}{IP3:PORT3}{dm}{xpack.installed=true}], id [61301614]
[2021-07-08T12:31:30,461][WARN ][o.e.t.TransportService   ] [node1] Received response for a request that has timed out, sent [24747ms] ago, timed out [14863ms] ago, action [internal:coordination/fault_detection/follower_check], node [{node3}{crZRSEZoTK6yezvAKlKf2g}{EV1zweQOQiKzkN6QaZpoFQ}{IP3}{IP3:PORT3}{dm}{xpack.installed=true}], id [61301769]
[2021-07-08T12:31:30,526][INFO ][o.e.c.s.ClusterApplierService] [node1] master node changed {previous [{node1}{Ul0e8cWAT3G88k3T0JE2vA}{eR_VQt9JSPaAAEfubqpwSg}{IP1}{IP1:PORT1}{dm}{xpack.installed=true}], current []}, term: 68, version: 1494021, reason: becoming candidate: joinLeaderInTerm
[2021-07-08T12:31:30,596][WARN ][o.e.c.s.MasterService    ] [node1] failing   cluster state version [1494022]
org.elasticsearch.cluster.coordination.FailedToCommitClusterStateException: publication failed
	at org.elasticsearch.cluster.coordination.Coordinator$CoordinatorPublication$4.onFailure(Coordinator.java:1429) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.action.ActionRunnable.onFailure(ActionRunnable.java:60) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:39) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:225) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.common.util.concurrent.ListenableFuture.notifyListener(ListenableFuture.java:93) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.common.util.concurrent.ListenableFuture.addListener(ListenableFuture.java:55) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.cluster.coordination.Coordinator$CoordinatorPublication.onCompletion(Coordinator.java:1349) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.cluster.coordination.Publication.onPossibleCompletion(Publication.java:125) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.cluster.coordination.Publication.cancel(Publication.java:89) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.cluster.coordination.Coordinator.cancelActivePublication(Coordinator.java:1126) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.cluster.coordination.Coordinator.becomeCandidate(Coordinator.java:541) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.cluster.coordination.Coordinator.joinLeaderInTerm(Coordinator.java:456) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.cluster.coordination.Coordinator.ensureTermAtLeast(Coordinator.java:444) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.cluster.coordination.Coordinator.onFollowerCheckRequest(Coordinator.java:238) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.cluster.coordination.FollowersChecker$2.doRun(FollowersChecker.java:187) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:773) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.4.0.jar:7.4.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
Caused by: org.elasticsearch.ElasticsearchException: publication cancelled before committing: become candidate: joinLeaderInTerm
	at org.elasticsearch.cluster.coordination.Publication.cancel(Publication.java:86) ~[elasticsearch-7.4.0.jar:7.4.0]
	... 11 more
[2021-07-08T12:31:30,970][WARN ][o.e.t.TcpTransport       ] [node1] exception caught on transport layer [Netty4TcpChannel{localAddress=0.0.0.0/0.0.0.0:PORT1, remoteAddress=/IP3:33558}], closing connection
io.netty.handler.codec.DecoderException: javax.net.ssl.SSLException: Received close_notify during handshake
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:475) ~[netty-codec-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:283) ~[netty-codec-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1421) [netty-transport-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:930) [netty-transport-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:697) [netty-transport-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:597) [netty-transport-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:551) [netty-transport-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:511) [netty-transport-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918) [netty-common-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.38.Final.jar:4.1.38.Final]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
Caused by: javax.net.ssl.SSLException: Received close_notify during handshake
	at sun.security.ssl.Alerts.getSSLException(Alerts.java:208) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1666) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1634) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.recvAlert(SSLEngineImpl.java:1776) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:1083) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:907) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781) ~[?:?]
	at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624) ~[?:1.8.0_131]
	at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:282) ~[netty-handler-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1329) ~[netty-handler-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1224) ~[netty-handler-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1271) ~[netty-handler-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:505) ~[netty-codec-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:444) ~[netty-codec-4.1.38.Final.jar:4.1.38.Final]
	... 16 more

[2021-07-08T12:31:33,306][INFO ][o.e.c.s.ClusterApplierService] [node1] master node changed {previous [], current [{node3}{crZRSEZoTK6yezvAKlKf2g}{EV1zweQOQiKzkN6QaZpoFQ}{IP3}{IP3:PORT3}{dm}{xpack.installed=true}]}, term: 69, version: 1494022, reason: ApplyCommitRequest{term=69, version=1494022, sourceNode={node3}{crZRSEZoTK6yezvAKlKf2g}{EV1zweQOQiKzkN6QaZpoFQ}{IP3}{IP3:PORT3}{dm}{xpack.installed=true}}

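The follower_check timeouts in the logs line up with the GC overhead message, which suggests a node was briefly unresponsive (long pause) rather than the network actually dropping. One thing that is sometimes tried in this situation is relaxing the cluster fault-detection settings in `elasticsearch.yml`. The setting names below are the standard ES 7.x cluster coordination settings; the values are only illustrative assumptions, not a confirmed fix for this cluster:

```yaml
# elasticsearch.yml -- illustrative values, not a confirmed fix.
# ES 7.x defaults: follower check interval 1s, timeout 10s, retry_count 3,
# publish timeout 30s.
cluster.fault_detection.follower_check.timeout: 30s    # tolerate longer pauses before a single check fails
cluster.fault_detection.follower_check.retry_count: 5  # require more consecutive failures before removing a node
cluster.publish.timeout: 60s                           # give cluster-state publication more headroom
```

Note that raising these values only masks the symptom; if the root cause is GC pressure on the nodes (as the `JvmGcMonitorService` overhead line hints), investigating heap sizing and GC behavior would be the more durable fix.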
