This looks like a problem with Logstash outputting to Elasticsearch. Which logstash-input-beats plugin version is installed?
Logstash has a timeout on beats connections in case the Logstash pipeline is blocked. After a while Logstash will close/reset the connections; see the congestion_threshold setting. The most recent plugin version removes this auto-closing when internal pipelines are blocked.
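For older plugin versions the setting goes directly in the beats input block. A minimal sketch of what I mean (the port and the threshold value are just example numbers, not your actual config):

input {
  beats {
    port => 5044
    # seconds to wait on a blocked pipeline before the connection is closed
    # (only present in older plugin versions; removed in recent releases)
    congestion_threshold => 60
  }
}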
First of all, it is only Filebeat that is not sending the events.
2016-11-22T16:43:04Z DBG Try to publish 932 events to logstash with window size 1
2016-11-22T16:43:04Z DBG close connection
2016-11-22T16:43:04Z DBG 0 events out of 932 events sent to logstash. Continue sending ...
2016-11-22T16:43:04Z INFO Error publishing events (retrying): EOF
2016-11-22T16:43:04Z INFO send fail
2016-11-22T16:43:04Z INFO backoff retry: 1m0s
So there are no issues on the Logstash and ES end now.
I can still see the error below on the Filebeat side, and I suspect the logs are not reaching Logstash from Filebeat.
2016-11-22T18:32:22Z DBG Try to publish 932 events to logstash with window size 1
2016-11-22T18:32:22Z DBG close connection
2016-11-22T18:32:22Z DBG 0 events out of 932 events sent to logstash. Continue sending ...
2016-11-22T18:32:22Z INFO Error publishing events (retrying): EOF
2016-11-22T18:32:22Z INFO send fail
2016-11-22T18:32:22Z INFO backoff retry: 1m0s
Can you still check the Logstash logs? EOF means End Of File; you get this when the connection has been closed by the remote end (i.e., closed by Logstash). See my earlier post about setting the congestion threshold in Logstash so it does not close connections.
Filebeat is trying to send events, but Logstash is not ACKing them. Due to internals in the plugin version you're using, Logstash seems to close the connection.
No, there are no issues with the Logstash server. I tried one more workaround: I launched a machine in the public subnet, installed Filebeat, configured it to send logs to the same Logstash server, and started the agent. This works without any issues and I'm able to see the logs in Kibana.
But I'm facing the issue with the Filebeat agents installed on the app servers in the private subnet. There are no port issues between these app servers and the Logstash server, yet I still see the same error in the Filebeat logs.
I found my error by locating the exact place in the code where my exception was thrown (I would suggest doing the same). Next I modified the code to log additional information about the connection (URL, parameters, libraries used, etc.). I compiled the code and deployed it on the server. I repeated these steps a few times until I found the reason. In my case it was code depending on an API call that AWS does not support in the same way as stock Elasticsearch.
It is a somewhat primitive way of doing it, but it was the only way I could trace the root cause of the problem. Fortunately all my code was open source.
It seems a fix should be available in the most recent plugin (version 3.1.10). Can you update the beats plugin and set client_inactivity_timeout in Logstash to something very large (or 0 to disable Logstash disconnecting clients) and see if this fixes your issue?
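In case it helps, a minimal sketch of the beats input with that option set (the port is just an example value):

input {
  beats {
    port => 5044
    # 0 disables the inactivity timeout so Logstash never disconnects idle/slow clients
    client_inactivity_timeout => 0
  }
}

The plugin itself can be updated with bin/logstash-plugin update logstash-input-beats (bin/plugin update logstash-input-beats on Logstash 2.x).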