Failed to publish events caused by: EOF

(Scott Vincent) #1


I am running filebeat v5 and logstash version 5 as well. I have pretty much identical filebeat config files on the local server (the one with the rest of the ELK stack) and on a remote server. I am getting logs locally just fine, but whenever I bring a remote server into the picture, my filebeat log looks like:

2016-10-29T10:59:10-04:00 INFO Non-zero metrics in the last 30s: filebeat.harvester.running=2 libbeat.publisher.published_events=2044 filebeat.harvester.started=2 filebeat.harvester.open_files=2
2016-10-29T10:59:11-04:00 ERR Failed to publish events caused by: EOF
2016-10-29T10:59:11-04:00 INFO Error publishing events (retrying): EOF
2016-10-29T10:59:40-04:00 INFO Non-zero metrics in the last 30s: libbeat.logstash.published_but_not_acked_events=1024 libbeat.logstash.call_count.PublishEvents=1 libbeat.logstash.publish.write_bytes=835 libbeat.logstash.publish.read_errors=1
2016-10-29T10:59:43-04:00 ERR Failed to publish events caused by: EOF
2016-10-29T10:59:43-04:00 INFO Error publishing events (retrying): EOF
2016-10-29T11:00:10-04:00 INFO Non-zero metrics in the last 30s: libbeat.logstash.publish.write_bytes=654 libbeat.logstash.publish.read_errors=1 libbeat.logstash.published_but_not_acked_events=1024 libbeat.logstash.call_count.PublishEvents=1
2016-10-29T11:00:40-04:00 INFO No non-zero metrics in the last 30s
2016-10-29T11:00:43-04:00 ERR Failed to publish events caused by: EOF
2016-10-29T11:00:43-04:00 INFO Error publishing events (retrying): EOF
2016-10-29T11:01:10-04:00 INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=1 libbeat.logstash.publish.read_errors=1 libbeat.logstash.published_but_not_acked_events=1024 libbeat.logstash.publish.write_bytes=504

I've read posts with similar issues, but nothing has seemed to fix the issue. I tried different values for client_inactivity_timeout (0/300/900), still no luck.

Here is my logstash config:

input {
    beats {
        port => 5044
        client_inactivity_timeout => 0
        ssl => true
        ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
        ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"

And my filebeat config:

- input_type: log
    - /var/log/messages
    - /
  document_type: syslog

  hosts: [""]
  bulk_max_size: 1024

  tls.certificate_authorities: ["/etc/pki/tls/certs/log-forwarder.crt"]

  to_syslog: false
  to_files: true
    path: /var/log/mybeat
    rotateeverybytes: 10485760 #10MB
    keepfiles: 7
  level: debug

Any help would be greatly appreciated.


(Steffen Siering) #2

Anything in the Logstash logs? Is Logstash able to write events to its output (e.g. try with stdout or null output in logstash)?

Besides logs, it might be interesting to run tcpdump on filebeat and logstash host to collect and compare connection setup/teardown on both hosts (sometimes firewalls+network equipment can mess with TCP).

For tcpdump try:

sudo tcpdump -w test.pcap -i <device> 'tcp[tcpflags] & (tcp-syn|tcp-fin|tcp-rst) != 0'

Add an IP and/or port to the tcpdump filter to reduce noise in test.pcap.

(Scott Vincent) #3

Thanks for the response. I ran tcpdump on both ends and it looks like they are talking fine. Here is a sample of what I am getting (.74 == elk).

I have firewalls turned off on both ends. I'm not really sure how to test if the networking equipment is messing things up but I don't think that's the issue here. I looked at the logstash logs before and there doesn't seem to be anything useful, just messages from starting/stopping the service. Here's a sample output:

root@b8bc1a257b3e:/# tail /var/log/logstash/logstash.log 
{:timestamp=>"2016-10-31T12:22:59.296000+0000", :message=>"Pipeline main started"}
{:timestamp=>"2016-10-31T13:02:43.128000+0000", :message=>"SIGTERM received. Shutting down the agent.", :level=>:warn}
{:timestamp=>"2016-10-31T13:02:43.164000+0000", :message=>"stopping pipeline", :id=>"main"}
{:timestamp=>"2016-10-31T13:02:43.816000+0000", :message=>"Pipeline main has been shutdown"}
{:timestamp=>"2016-10-31T09:40:24.103000+0000", :message=>"Pipeline main started"}
{:timestamp=>"2016-10-31T09:45:03.713000+0000", :message=>"SIGTERM received. Shutting down the agent.", :level=>:warn}
{:timestamp=>"2016-10-31T09:45:03.734000+0000", :message=>"stopping pipeline", :id=>"main"}
{:timestamp=>"2016-10-31T09:45:04.116000+0000", :message=>"Pipeline main has been shutdown"}
{:timestamp=>"2016-10-31T09:45:25.383000+0000", :message=>"Pipeline main started"}
{:timestamp=>"2016-10-31T10:06:29.379000+0000", :message=>"Pipeline main started"}

I turned on more verbose logging and sent it to stdout, but I'm not getting anything there either.

Adding pattern {"NGUSER"=>"%{NGUSERNAME}", :level=>:info}
Adding pattern {"NGINXACCESS"=>"%{IPORHOST:clientip} %{NGUSER:ident} %{NGUSER:auth} \\[%{HTTPDATE:timestamp}\\] \"%{WORD:verb} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:\"(?:%{URI:referrer}|-)\"|%{QS:referrer}) %{QS:agent}", :level=>:info}
Starting pipeline {:id=>"main", :pipeline_workers=>1, :batch_size=>125, :batch_delay=>5, :max_inflight=>125, :level=>:info}
Pipeline main started

(Steffen Siering) #4

Is the trace from filebeat or logstash endpoint? Do both traces look exactly the same? Looks like logstash is closing the connection right on first byte received from beats. This can happen with TLS/SSL if any endpoint is not correctly configured. This points me to your beats config file: The tls section has been renamed to ssl in 5.0 GA. Change output.logstash.tls.certificate_authorities to output.logstash.ssl.certificate_authorities and you should be fine.
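For anyone landing here with the same symptom, the rename described above could look roughly like this in filebeat.yml. This is only a sketch: the hosts value is a hypothetical placeholder (the original post does not show the real host), and the other settings are copied from the config posted earlier in the thread.

```yaml
# Sketch of the Logstash output section after the tls -> ssl rename (Filebeat 5.0 GA).
output.logstash:
  # Hypothetical placeholder; the real host is not shown in the original post.
  hosts: ["logstash.example.com:5044"]
  bulk_max_size: 1024

  # Before 5.0 GA this setting was written as:
  #   tls.certificate_authorities: ["/etc/pki/tls/certs/log-forwarder.crt"]
  ssl.certificate_authorities: ["/etc/pki/tls/certs/log-forwarder.crt"]
```

Restart filebeat after the change so the new setting is picked up.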

(Steffen Siering) #5

Have you tried to run logstash in debug mode? Before fixing the filebeat config, can you please try and check the logstash log output? I'm curious whether it's because of the log level that no warning/error is being printed by logstash.

(Scott Vincent) #6

Changing tls to ssl fixed it. I was learning on an older version the other day where tls was supported so I didn't realize it wasn't supported in the new version when I upgraded. Thanks a lot @steffens!
