Hi,
We keep seeing this issue intermittently in our cluster. After the "send message failed" warning appears, the node gets removed from the cluster and the cluster status goes red.
Immediately afterwards, the node gets added back into the cluster.
This happens several times an hour, at no fixed interval. Is there a way we can work around this issue?
[2018-11-14T11:59:16,974][WARN ][o.e.x.s.t.n.SecurityNetty4ServerTransport] [es-node1-001] send message failed [channel: NettyTcpChannel{localAddress=/173.37.96.31:9300, remoteAddress=/173.36.39.60:50426}]
javax.net.ssl.SSLException: SSLEngine closed already
at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[?:?]
[2018-11-14T11:59:16,974][WARN ][o.e.x.s.t.n.SecurityNetty4ServerTransport] [es-node1-001] send message failed [channel: NettyTcpChannel{localAddress=/173.37.96.31:9300, remoteAddress=/173.36.39.60:50426}]
javax.net.ssl.SSLException: SSLEngine closed already
at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[?:?]
[2018-11-14T12:00:08,117][WARN ][o.e.x.s.t.n.SecurityNetty4ServerTransport] [es-node1-001] exception caught on transport layer [NettyTcpChannel{localAddress=/173.37.96.31:9300, remoteAddress=/173.36.39.60:50672}], closing connection
[2018-11-14T12:07:02,344][WARN ][o.e.x.s.t.n.SecurityNetty4ServerTransport] [es-node1-001] send message failed [channel: NettyTcpChannel{localAddress=0.0.0.0/0.0.0.0:9300, remoteAddress=/173.36.39.60:50730}]
java.nio.channels.ClosedChannelException: null
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source) ~[?:?]
[2018-11-14T12:08:31,298][INFO ][o.e.c.s.ClusterApplierService] [es-node1-001] removed {{es-node2-001}{V4A7hjvtQcyFW7BlxG-j4w}{3GXiTiZ_QfCfFtprDhA2og}{es-node2-001}{173.36.39.60:9300}{xpack.installed=true},}, reason: apply cluster state (from master [master {es-node1-002}{44pA9ErPTb-y3zOylW8Z_Q}{byFkKyl5S1a_s3RiujXyRA}{es-node1-002}{173.37.96.32:9300}{xpack.installed=true} committed version [148]])
[2018-11-14T12:08:31,809][DEBUG][o.e.a.a.c.n.s.TransportNodesStatsAction] [es-node1-001] failed to execute on node [V4A7hjvtQcyFW7BlxG-j4w]
org.elasticsearch.transport.NodeDisconnectedException: [es-node2-001][173.36.39.60:9300][cluster:monitor/nodes/stats[n]] disconnected
[2018-11-14T12:08:58,991][INFO ][o.e.c.s.ClusterApplierService] [es-node1-001] added {{es-node2-001}{V4A7hjvtQcyFW7BlxG-j4w}{3GXiTiZ_QfCfFtprDhA2og}{es-node2-001}{173.36.39.60:9300}{xpack.installed=true},}, reason: apply cluster state (from master [master {es-node1-002}{44pA9ErPTb-y3zOylW8Z_Q}{byFkKyl5S1a_s3RiujXyRA}{es-node1-002}{173.37.96.32:9300}{xpack.installed=true} committed version [152]])
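To put a number on how often the node drops out and how long it stays out, the "removed" and "added" lines above could be paired up with something like the following. This is only a minimal sketch: the regular expression assumes the log layout shown in this post, and the function names and the log file path passed on the command line are hypothetical.

```python
import re
import sys
from datetime import datetime

# Assumes the layout seen in the excerpt above:
# [timestamp][INFO ][o.e.c.s.ClusterApplierService] [local-node] removed|added {{node-name}...
EVENT_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[INFO\s*\]\[o\.e\.c\.s\.ClusterApplierService\]\s*"
    r"\[[^\]]+\] (?P<action>removed|added) \{\{(?P<node>[^}]+)\}"
)


def parse_events(lines):
    """Yield (timestamp, action, node) for every removed/added log line."""
    for line in lines:
        m = EVENT_RE.match(line)
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S,%f")
            yield ts, m.group("action"), m.group("node")


def outage_durations(lines):
    """Pair each 'removed' with the next 'added' for the same node.

    Returns a list of (node, removed_at, seconds_until_added_back).
    """
    pending = {}   # node name -> time it was removed
    outages = []
    for ts, action, node in parse_events(lines):
        if action == "removed":
            pending[node] = ts
        elif node in pending:
            removed_at = pending.pop(node)
            outages.append((node, removed_at, (ts - removed_at).total_seconds()))
    return outages


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python outages.py /path/to/cluster.log
    with open(sys.argv[1], encoding="utf-8") as fh:
        for node, removed_at, seconds in outage_durations(fh):
            print(f"{node} removed at {removed_at} for {seconds:.1f}s")
```

Run against the full log on the master-eligible nodes, this would show whether the drop-outs cluster around particular times, which may help correlate them with network or GC events.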