ERR Failed to publish events caused by: read tcp (filebeat Version: 5.5.1)

This is what my filebeat.yml looks like:

output.logstash:
  # The Logstash hosts
  hosts: ["55.194.113.9:5043"]
  bulk_max_size: 1048

The Logstash instance transforms the logs and indexes them into AWS-managed Elasticsearch.
I am parsing Apache logs, and the server traffic is very high because it hosts the logs for customers watching Channels/VOD/Movies.

Both the Logstash and Filebeat services have been up for more than 2 months and everything was running fine, when suddenly yesterday Filebeat stopped forwarding logs to Logstash. The Logstash logs look fine:

[2017-10-11T07:15:44,330][INFO ][logstash.outputs.amazones] New Elasticsearch output {:hosts=>["search-logs-es-test-abcasdsus-west-1.es.amazonaws.com"], :port=>443}
[2017-10-11T07:15:44,911][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>125}
[2017-10-11T07:15:46,087][INFO ][logstash.inputs.beats    ] Beats inputs: Starting input listener {:address=>"0.0.0.0:5043"}
[2017-10-11T07:15:46,213][INFO ][logstash.pipeline        ] Pipeline main started
[2017-10-11T07:15:46,329][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600} 

And here is what my Filebeat logs look like:

2017-10-11T07:18:44Z INFO Setup Beat: filebeat; Version: 5.5.1
2017-10-11T07:18:44Z INFO Max Retries set to: 3
2017-10-11T07:18:44Z INFO Activated logstash as output plugin.
2017-10-11T07:18:44Z INFO Publisher name: ip-10-123-23-222
2017-10-11T07:18:44Z INFO Flush Interval set to: 1s
2017-10-11T07:18:44Z INFO Max Bulk Size set to: 1048
2017-10-11T07:18:44Z INFO filebeat start running.
2017-10-11T07:18:44Z INFO Registry file set to: /var/lib/filebeat/registry
2017-10-11T07:18:44Z INFO Loading registrar data from /var/lib/filebeat/registry
2017-10-11T07:18:44Z INFO States Loaded from registrar: 7
2017-10-11T07:18:44Z INFO Loading Prospectors: 1
2017-10-11T07:18:44Z INFO Prospector with previous states loaded: 7
2017-10-11T07:18:44Z INFO Starting Registrar
2017-10-11T07:18:44Z INFO Starting prospector of type: log; id: 6220407458949788790
2017-10-11T07:18:44Z INFO Loading and starting Prospectors completed. Enabled prospectors: 1
2017-10-11T07:18:44Z INFO Starting spooler: spool_size: 2048; idle_timeout: 5s
2017-10-11T07:18:44Z INFO Harvester started for file: /var/log/apache2/access.log
2017-10-11T07:19:14Z INFO Non-zero metrics in the last 30s: filebeat.harvester.open_files=1 filebeat.harvester.running=1 filebeat.harvester.started=1 libbe$
2017-10-11T07:19:44Z INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=2 libbeat.logstash.publish.read_bytes=24 libbeat.logs$
2017-10-11T07:29:44Z INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=4 libbeat.logstash.publish.read_bytes=24 libbeat.logs$
2017-10-11T07:30:14Z INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=4 libbeat.logstash.publish.read_bytes=24 libbeat.logs$
2017-10-11T07:30:44Z INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=1 libbeat.logstash.publish.read_bytes=6 libbeat.logst$
2017-10-11T07:31:14Z INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=1 libbeat.logstash.publish.read_bytes=6 libbeat.logst$
2017-10-11T07:31:44Z INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=1 libbeat.logstash.publish.read_bytes=6 libbeat.logst$
2017-10-11T07:32:06Z ERR Failed to publish events caused by: read tcp 10.546.34.111:5192->55.194.113.9:5043: i/o timeout
2017-10-11T07:32:06Z INFO Error publishing events (retrying): read tcp 10.546.34.111:5192->55.194.113.9:5043: i/o timeout
2017-10-11T07:32:14Z INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=1 libbeat.logstash.publish.read_errors=1 libbeat.logs$
2017-10-11T07:32:44Z INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=1 libbeat.logstash.publish.read_bytes=12 libbeat.logs$
2017-10-11T07:33:14Z INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=1 libbeat.logstash.publish.read_bytes=12 libbeat.logs$
2017-10-11T07:33:22Z ERR Failed to publish events caused by: read tcp 10.346.3s.111:5392->55.194.113.9:5043: i/o timeout
2017-10-11T07:33:22Z INFO Error publishing events (retrying): read tcp 10.346.3s.111:5392->55.194.113.9:5043: i/o timeout

Could someone please help me with this?

Which filebeat/logstash versions are you using?

The failure is in filebeat waiting for the ACK signal from Logstash. The default timeout (it's configurable) is 60 seconds.
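
As a rough sketch of where that timeout lives, assuming the `timeout` option of the Logstash output in filebeat.yml (the value 90 is only an illustrative choice, not a recommendation):

```yaml
output.logstash:
  # The Logstash hosts
  hosts: ["55.194.113.9:5043"]
  bulk_max_size: 1048
  # Seconds to wait for a response (the ACK) from Logstash before timing out.
  # 90 is an assumed example value; tune it to how long Logstash really needs.
  timeout: 90
```

Raising the timeout only hides the symptom if Logstash is stalled, so it is worth finding out why the ACK is late in the first place.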

Is Logstash being blocked or slowed down, either because its outputs are unavailable or because a grok pattern is choking on a recently introduced log message format?

How can I check that? Should I run Logstash in some debug mode? If yes, how?

Check the Logstash logs for errors from the outputs.
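
If the default log output is not detailed enough, one option is to raise the log level in logstash.yml. This is a minimal sketch, assuming the `log.level` setting available in Logstash 5.x; debug logging is verbose, so it is probably best enabled only while investigating:

```yaml
# logstash.yml
# Assumed temporary setting for troubleshooting; the usual level is info.
log.level: debug
```

Logstash needs a restart to pick up the change, and the setting should be reverted once the cause is found.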

Also have a look at the Logstash monitoring docs.
