Filebeat: connection refused, EOF

We use filebeat -> logstash to transport logs to graylog. As soon as filebeat logs:

Error publishing events (retrying): EOF

it is followed by:

Connecting error publishing events (retrying): dial tcp y.y.y.y:5043: getsockopt: connection refused

and the connection to logstash is lost. A status check on logstash then returns "not running". Logstash logs something like:

{:timestamp=>"2016-03-08T16:08:27.658000+0100", :message=>"Beats Input: Remote connection closed", :peer=>"x.x.x.x", :exception=>#, :level=>:warn}

x.x.x.x and y.y.y.y ping each other.
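Note that a successful ping only shows ICMP reachability; it says nothing about whether anything is listening on TCP port 5043. A minimal sketch of a port check, assuming Python 3 is available on the filebeat host. The function name port_open is made up for illustration and is not part of filebeat or logstash:

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection performs the full TCP handshake, which is
        # roughly what filebeat's "dial tcp" does before sending events.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False
```

Calling port_open("y.y.y.y", 5043) with the real Logstash address in place of the placeholder should return False at the moments filebeat reports "connection refused", and True while the beats input is listening.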

After that point, there are no incoming logs on graylog. Once we start logstash again, everything is ok until the next EOF occurs.

We use logstash-2.2.2-1.noarch with beat plugin 2.1.4, running on CentOS release 6.6, and filebeat version 1.1.1 (amd64) on CentOS release 6.4 (Final).

Output configuration on filebeat is:
logstash:
  # The Logstash hosts
  hosts: ["y.y.y.y:5043"]
with one worker and no tls.

Input configuration on logstash is:
input { beats { port => 5043 } }

Any help appreciated.

There are two open issues and one closed issue related to this in logstash-input-beats: https://github.com/logstash-plugins/logstash-input-beats/issues?utf8=✓&q=is%3Aissue++Remote+connection
Please check whether one of them matches your pattern.

Thanks ruflin,

I checked these issues, but didn't find any solution. We already use input beats plugin 2.1.4. We have plain tcp between filebeat and logstash, no ssl, and no proxy. We tried to downgrade to logstash 2.1.9 as suggested in the closed issue, but could only find 2.1.3, and the downgrade made no difference. I turned on the debug log level in filebeat, and here's what we noticed:

2016-03-11T15:22:43+01:00 DBG full line read
2016-03-11T15:22:43+01:00 DBG End of file reached: /usr/local/mpg/log/mpg.log; Backoff now.
2016-03-11T15:22:43+01:00 DBG connect
2016-03-11T15:22:43+01:00 INFO Connecting error publishing events (retrying): dial tcp x.x.x.x:5043: getsockopt: connection refused
2016-03-11T15:22:43+01:00 INFO send fail
2016-03-11T15:22:43+01:00 INFO backoff retry: 8s
2016-03-11T15:22:44+01:00 DBG Flushing spooler because of timeout. Events flushed: 654
2016-03-11T15:22:44+01:00 DBG full line read

There was no connection to logstash for only a short period of time (reason unknown - EOF?), and after the backoff, new lines are found. Logstash is running, nothing wrong.
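To line these short outages up against the timestamps in the logstash log, it can help to pull just the "connection refused" entries out of the filebeat debug log. A rough sketch, assuming the log lines have the "timestamp LEVEL message" shape shown in the excerpts here. The names LINE_RE and refused_timestamps are invented for this example:

```python
import re

# Matches filebeat log lines of the form seen in this thread, e.g.
# 2016-03-11T15:22:43+01:00 INFO Connecting error publishing events ...
LINE_RE = re.compile(r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<msg>.*)$")


def refused_timestamps(lines):
    """Return the timestamps of all 'connection refused' log lines."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and "connection refused" in m.group("msg"):
            hits.append(m.group("ts"))
    return hits
```

Feeding it the lines of the filebeat log file (for example via open(path) on whatever log path is configured) gives a list of timestamps that can be compared with the time logstash stopped.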

2016-03-11T15:22:44+01:00 DBG End of file reached: /usr/local/mpg/log/mpg.log; Backoff now.
2016-03-11T15:22:45+01:00 DBG full line read

Yet another EOF, but a new line is found immediately afterwards. Logstash is running.

2016-03-11T15:22:50+01:00 DBG full line read
2016-03-11T15:22:51+01:00 DBG connect
2016-03-11T15:22:51+01:00 INFO Connecting error publishing events (retrying): dial tcp 10.50.110.20:5043: getsockopt: connection refused
2016-03-11T15:22:51+01:00 INFO send fail
2016-03-11T15:22:51+01:00 INFO backoff retry: 16s
2016-03-11T15:22:59+01:00 DBG Start next scan
2016-03-11T15:22:59+01:00 DBG scan path /usr/local/mpg/log/mpg.log
2016-03-11T15:22:59+01:00 DBG Check file for harvesting: /usr/local/mpg/log/mpg.log
2016-03-11T15:22:59+01:00 DBG Update existing file for harvesting: /usr/local/mpg/log/mpg.log
2016-03-11T15:22:59+01:00 DBG Not harvesting, file didn't change: /usr/local/mpg/log/mpg.log
2016-03-11T15:23:07+01:00 DBG connect
2016-03-11T15:23:07+01:00 INFO Connecting error publishing events (retrying): dial tcp 10.50.110.20:5043: getsockopt: connection refused
2016-03-11T15:23:07+01:00 INFO send fail
2016-03-11T15:23:07+01:00 INFO backoff retry: 32s
2016-03-11T15:23:09+01:00 DBG Start next scan
2016-03-11T15:23:09+01:00 DBG scan path /usr/local/mpg/log/mpg.log
2016-03-11T15:23:09+01:00 DBG Check file for harvesting: /usr/local/mpg/log/mpg.log
2016-03-11T15:23:09+01:00 DBG Update existing file for harvesting: /usr/local/mpg/log/mpg.log
2016-03-11T15:23:09+01:00 DBG Not harvesting, file didn't change: /usr/local/mpg/log/mpg.log

No connection to logstash once again, but this time no harvesting either, because the file didn't change (EOF again?). After two backoffs, the file still hadn't changed. At this point, a status check on logstash returned 'not running'; it had shut down.
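The "backoff retry" values in the log (8s, 16s, 32s) double on each failed attempt. As an illustration only, here is a small sketch of a doubling backoff with an upper limit. The parameter names and the 60 second cap are assumptions for this example, not taken from the filebeat source or configuration:

```python
from itertools import islice


def backoff_delays(start=1, factor=2, cap=60):
    """Yield retry delays in seconds: start, start*factor, ... limited to cap."""
    delay = start
    while True:
        yield min(delay, cap)
        # Grow the delay for the next attempt, never beyond the cap.
        delay = min(delay * factor, cap)


# Starting at 8s reproduces the delays visible in the log excerpt.
print(list(islice(backoff_delays(start=8), 3)))  # [8, 16, 32]
```

The practical consequence is that after logstash comes back, filebeat may wait up to the current delay before it reconnects, so a gap in graylog can outlast the actual outage.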

If I understand you correctly, in the end Logstash was not running anymore, meaning it shut down / crashed without anything being done on the Logstash side?

That's right, logstash shut down, and the only thing in the log indicating any sign of a breakdown is:

{:timestamp=>"2016-03-08T16:08:27.658000+0100", :message=>"Beats Input: Remote connection closed", :peer=>"x.x.x.x", :exception=>#, :level=>:warn}

ruflin,
Thanks for the effort. The problem was on our side, with Linux user privileges. Our administrator solved the mystery. Filebeat rocks now :)

@cernicb Good to hear. Thanks for also posting the "solution".