Something is not right and I would like some help figuring out what and why.
I have 3 ELK nodes (Elasticsearch 2.1.0; Logstash 2.1.1), and every hour the KOPF plugin shows that they leave the cluster and rejoin after a few seconds.
They don't leave at the same time; there is an interval between them.
Currently the uptime for my nodes is:
elastic-node-01 : 13 minutes
elastic-node-02 : 28 minutes
elastic-node-03 : 15 minutes
The file /var/log/logstash/logstash.err shows the following:
Dez 15, 2015 12:02:02 AM org.apache.http.impl.execchain.RetryExec execute
INFORMAÇÕES: I/O exception (java.net.SocketException) caught when processing request to {}->http://127.0.0.1:9200: Broken pipe
Dez 15, 2015 12:02:02 AM org.apache.http.impl.execchain.RetryExec execute
INFORMAÇÕES: Retrying request to {}->http://127.0.0.1:9200
Dez 15, 2015 1:01:54 AM org.apache.http.impl.execchain.RetryExec execute
INFORMAÇÕES: I/O exception (java.net.SocketException) caught when processing request to {}->http://127.0.0.1:9200: Broken pipe
Dez 15, 2015 1:01:54 AM org.apache.http.impl.execchain.RetryExec execute
INFORMAÇÕES: Retrying request to {}->http://127.0.0.1:9200
and /var/log/elasticsearch/myclustername.log says:
2015-12-15 00:00:56,530[INFO ][discovery.zen ] [elastic-node-01] master_left [{elastic-node-03}{Va5P4dq8QkmbYPLjb2skCw}{10.20.30.163}{10.20.30.163:9300}], reason [shut_down]
2015-12-15 00:00:56,546[WARN ][discovery.zen ] [elastic-node-01] master left (reason = shut_down), current nodes: {{elastic-node-02}{nkZ50XnqTcOo4niZuchMyw}{10.20.30.162}{10.20.30.162:9300},{elastic-node-01}{xMOAjXAGRnucrXe8IDnwww}{10.20.30.161}{10.20.30.161:9300},}
2015-12-15 00:00:56,548[INFO ][cluster.service ] [elastic-node-01] removed {{elastic-node-03}{Va5P4dq8QkmbYPLjb2skCw}{10.20.30.163}{10.20.30.163:9300},}, reason: zen-disco-master_failed ({elastic-node-03}{Va5P4dq8QkmbYPLjb2skCw}{10.20.30.163}{10.20.30.163:9300})
2015-12-15 00:00:56,552[WARN ][discovery.zen.ping.unicast] [elastic-node-01] failed to send ping to [{elastic-node-03}{Va5P4dq8QkmbYPLjb2skCw}{10.20.30.163}{10.20.30.163:9300}]
RemoteTransportException[[elastic-node-03][10.20.30.163:9300][internal:discovery/zen/unicast]]; nested: IllegalStateException[received ping request while not started];
followed by a Java exception:
Caused by: java.lang.IllegalStateException: received ping request while not started
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.handlePingRequest(UnicastZenPing.java:497)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.access$2400(UnicastZenPing.java:83)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$UnicastPingRequestHandler.messageReceived(UnicastZenPing.java:522)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$UnicastPingRequestHandler.messageReceived(UnicastZenPing.java:518)
at org.elasticsearch.transport.netty.MessageChannelHandler.handleRequest(MessageChannelHandler.java:244)
at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:114)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:75)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
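To check whether the departures really line up on the hour, something like the following rough Python sketch could pull the `master_left` events out of the Elasticsearch log and print the gaps between them. It is not part of my setup: the function names `master_left_events` and `gaps_in_minutes` are made up for illustration, and the regular expression assumes the exact line format shown above.

```python
import re
import sys
from datetime import datetime

# Assumed line format (taken from the log excerpt above):
# 2015-12-15 00:00:56,530[INFO ][discovery.zen ] [elastic-node-01] master_left [{elastic-node-03}...], reason [shut_down]
EVENT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}"
    r"\[(?P<level>\w+)\s*\]"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\[(?P<node>[^\]]+)\]\s+"
    r"master_left \[\{(?P<master>[^}]+)\}.*reason \[(?P<reason>[^\]]+)\]"
)


def master_left_events(lines):
    """Return (timestamp, reporting node, departed master, reason) tuples."""
    events = []
    for line in lines:
        m = EVENT_RE.match(line)
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
            events.append((ts, m.group("node"), m.group("master"), m.group("reason")))
    return events


def gaps_in_minutes(events):
    """Minutes elapsed between consecutive events, in the order given."""
    times = [e[0] for e in events]
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python master_left.py /var/log/elasticsearch/myclustername.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        found = master_left_events(fh)
    for ts, node, master, reason in found:
        print(ts, node, "saw master", master, "leave, reason:", reason)
    print("gaps (minutes):", gaps_in_minutes(found))
```

Running it against the log on each node would show whether the gaps are close to 60 minutes and which reason is reported each time.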
Any hints?