Filebeat : Badly Stuck with Connection reset by peer error

Hi, I'm badly stuck with a "connection reset by peer" error even though all TCP ports are open. Can anyone guide me here?

2016-11-22T13:32:39Z INFO Error publishing events (retrying): read tcp> read: connection reset by peer
2016-11-22T13:32:39Z INFO send fail
2016-11-22T13:32:39Z INFO backoff retry: 2s

Which versions of Filebeat and Logstash are you using?

Have you checked the Logstash logs?

Here are the Logstash logs:

{:timestamp=>"2016-11-22T15:43:44.577000+0000", :message=>"Cannot get new connection from pool.", :class=>"Elasticsearch::Transport::Transport::Error", :backtrace=>["/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.15/lib/elasticsearch/transport/transport/base.rb:193:in perform_request'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.15/lib/elasticsearch/transport/transport/http/manticore.rb:54:inperform_request'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.15/lib/elasticsearch/transport/transport/sniffer.rb:32:in hosts'", "org/jruby/ext/timeout/'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.15/lib/elasticsearch/transport/transport/sniffer.rb:31:in hosts'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.15/lib/elasticsearch/transport/transport/base.rb:76:inreload_connections!'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-2.5.5-java/lib/logstash/outputs/elasticsearch/http_client.rb:72:in sniff!'", java/lib/logstash/output_delegator.rb:130:inworker_multi_receive'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.2.4-java/lib/logstash/output_delegator.rb:114:in multi_receive'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.2.4-java/lib/logstash/pipeline.rb:293:inoutput_batch'", "org/jruby/ each'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.2.4-java/lib/logstash/pipeline.rb:293:inoutput_batch'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.2.4-java/lib/logstash/pipeline.rb:224:in worker_loop'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.2.4-java/lib/logstash/pipeline.rb:193:instart_workers'"], :client_config=>{:hosts=>[""], :ssl=>nil, :transport_options=>{:socket_timeout=>0, :request_timeout=>0, :proxy=>nil, :ssl=>{}}, :transport_class=>Elasticsearch::Transport::Transport::HTTP::Manticore, :logger=>nil, :tracer=>nil, 
:reload_connections=>false, :retry_on_failure=>false, :reload_on_failure=>false, :randomize_hosts=>false}, :level=>:error}

logstash 2.2.4
filebeat version 1.3.1

This looks like a problem with Logstash outputting to Elasticsearch. Which logstash-input-beats plugin version is installed?

Logstash has a timeout on Beats connections in case the Logstash pipeline is blocked. After a while Logstash will close/reset the connections. See the congestion_threshold setting. The most recent plugin version removes the auto-closing if internal pipelines are closed.
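
As a minimal sketch of where that setting goes, assuming the beats input listens on port 5044 (the port mentioned in this thread) and that a longer wait suits your pipeline. The value 60 is only an illustrative assumption, not a recommendation:

```
input {
  beats {
    port => 5044
    # Seconds to wait on a blocked pipeline before the input times out.
    # 60 is an assumed example value; tune it for your own setup.
    congestion_threshold => 60
  }
}
```

Raising the threshold only gives a blocked pipeline more time before connections are dropped. If the Elasticsearch output stays stuck, the underlying cause still needs fixing.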

logstash-input-beats-2.2.7 is the version

First of all, it is only Filebeat that is not sending the events.

2016-11-22T16:43:04Z DBG Try to publish 932 events to logstash with window size 1
2016-11-22T16:43:04Z DBG close connection
2016-11-22T16:43:04Z DBG 0 events out of 932 events sent to logstash. Continue sending ...
2016-11-22T16:43:04Z INFO Error publishing events (retrying): EOF
2016-11-22T16:43:04Z INFO send fail
2016-11-22T16:43:04Z INFO backoff retry: 1m0s

I'm also confused by the error below in Logstash.

:message=>"Cannot get new connection from pool.", :class=>"Elasticsearch::Transport::Transport::Error"

I missed one more piece of information: we're using the Elasticsearch service from AWS.

@Robert_Firek, could you help here?

As per @tomwj post Elasitcsearch-ruby raises "Cannot get new connection from pool" error

I tried removing sniffing => true.

I'm now able to insert data into Elasticsearch by using the nc command over TCP port 5044 with the config below.

[ec2-user@ip-10-3-1-13 conf.d]$ nc 5044

Config for testing:

input {
tcp {
port => 5044

So there are no issues on the Logstash and ES end now.

I still see the error below on the Filebeat end, and I suspect the logs are not reaching Logstash from Filebeat.

2016-11-22T18:32:22Z DBG Try to publish 932 events to logstash with window size 1
2016-11-22T18:32:22Z DBG close connection
2016-11-22T18:32:22Z DBG 0 events out of 932 events sent to logstash. Continue sending ...
2016-11-22T18:32:22Z INFO Error publishing events (retrying): EOF
2016-11-22T18:32:22Z INFO send fail
2016-11-22T18:32:22Z INFO backoff retry: 1m0s

Can you still check the Logstash logs? EOF means End Of File. You get this if the connection has been closed by the remote end (here, closed by Logstash). See my earlier post about setting the congestion threshold in Logstash so that it does not close connections.

Filebeat is trying to send events, but Logstash is not ACKing them. Due to internals in the plugin version you're using, Logstash seems to close the connection.

Consider updating the beats input plugin.

No, there are no issues with the Logstash server. I tried one more workaround: I launched a machine in a public subnet, installed Filebeat, configured it to send logs to the same Logstash server, and started the agent. This works without any issues and I'm able to see the logs in Kibana.

But I'm facing the issue with the Filebeat agents installed on the app servers in the private subnet. There are no port issues between these app servers and the Logstash server, yet I still see the same error in the Filebeat logs.

There are no errors in the Logstash logs.

Your problem looks very similar to my problem (Elasitcsearch-ruby raises "Cannot get new connection from pool" error), but maybe it is now a problem in a different plugin.

I found my error by locating the exact place in the code where my exception was thrown (I would suggest doing the same). Next I modified the code to retrieve additional information about the connection (URL, parameters, libraries used, etc.). I compiled the code and deployed it on the server. I repeated these steps a few times until I found the reason. In my case I found code that depends on an API call which AWS does not support in the same way as Elasticsearch.

It is a rather primitive way of doing it, but only in this way was I able to trace the root cause of the problem. Fortunately all my code was open source.

I just found this bug report in the Logstash beats input plugin:

It seems a fix should be available in the most recent plugin (version 3.1.10). Can you update the beats plugin and set client_inactivity_timeout in Logstash to something very big (or 0 to disable Logstash disconnecting clients) and see if this fixes your issue?
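
A minimal sketch of that change, assuming the updated beats input is listening on port 5044 as elsewhere in this thread. The timeout value is an arbitrary large number chosen for illustration:

```
input {
  beats {
    port => 5044
    # Seconds of client inactivity before Logstash closes the connection.
    # 86400 (one day) is an assumed "very big" example value.
    client_inactivity_timeout => 86400
  }
}
```

If Filebeat stops reporting EOF with this in place, that points at the idle-connection handling in the older plugin rather than at the network between the private subnet and Logstash.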
