Metricbeat ERR Failed to publish events caused by: read tcp 127.0.0.1:56820->127.0.0.1:5044: i/o timeout

ritchierich · May 11, 2017, 2:13am

I'm getting the errors below when metricbeat publishes to logstash.

Versions:
logstash 5.4.0
metricbeat version 5.4.0 (amd64), libbeat 5.4.0

Metricbeat Log:

2017-05-11T00:57:41Z ERR Failed to publish events caused by: read tcp 127.0.0.1:43188->127.0.0.1:5044: i/o timeout
2017-05-11T00:58:33Z ERR Failed to publish events caused by: read tcp 127.0.0.1:46240->127.0.0.1:5044: i/o timeout
2017-05-11T00:59:08Z ERR Failed to publish events caused by: read tcp 127.0.0.1:49648->127.0.0.1:5044: i/o timeout
2017-05-11T01:00:00Z ERR Failed to publish events caused by: read tcp 127.0.0.1:51550->127.0.0.1:5044: i/o timeout
2017-05-11T01:00:53Z ERR Failed to publish events caused by: read tcp 127.0.0.1:54062->127.0.0.1:5044: i/o timeout
2017-05-11T01:01:28Z ERR Failed to publish events caused by: read tcp 127.0.0.1:56820->127.0.0.1:5044: i/o timeout

Netstat details:

netstat -an | grep 5044
tcp        0      0 0.0.0.0:5044                0.0.0.0:*                   LISTEN      
tcp        0      0 127.0.0.1:46818             127.0.0.1:5044              ESTABLISHED 
tcp        0      0 127.0.0.1:5044              127.0.0.1:46818             ESTABLISHED 
tcp        0      0 172.19.13.145:45044         34.202.71.250:3000          ESTABLISHED 
tcp        0      0 172.19.13.145:3000          52.52.206.128:50442         ESTABLISHED 
tcp        0      0 172.19.13.145:50444         54.67.1.143:3000            ESTABLISHED

Metricbeat Config:
metricbeat.modules:

- module: system
  metricsets:
    - cpu
    - load
    - core
    - diskio
    - filesystem
    - fsstat
    - memory
    - network
    - process
  enabled: true
  period: 10s
  processes: ['.*']

name: prod_aerospike-v3_10_0_3mm-euwest1-01b

output.logstash:
  hosts: ["localhost:5044"]

Logstash Input:
input { beats { port => 5044 tags => ["metricbeat"] } }

Thanks,
Rich

exekias · May 11, 2017, 8:07am

Hi @ritchierich,

Could you please check the logstash logs? Please post anything you find here.

Best regards

steffens · May 11, 2017, 10:35am

More complete logs with debug logging enabled (-d '*') will help us see when metricbeat started publishing the events and when the i/o timeout was triggered.
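For anyone following along: besides the -d '*' command-line flag, debug output can usually also be switched on in metricbeat.yml. A minimal sketch, assuming the standard libbeat logging options in the 5.x config format:

```yaml
# metricbeat.yml: logging section only (sketch, not taken from the thread)
logging.level: debug        # emit debug-level messages
logging.selectors: ["*"]    # enable every debug selector, the config-file counterpart of -d '*'
```

This is very verbose, so it is probably best enabled only while reproducing the timeout.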

Is Logstash stuck? The error happens when metricbeat is waiting for an ACK or keep-alive signal from Logstash. Normally logstash will send a keep-alive signal every few seconds, resetting the timer in metricbeat. Maybe you can get a pcap with tcpdump so we can see if communication takes place properly.

What happens if you increase the timeout from the default 30s to e.g. 1h? Is Logstash processing events? Can metricbeat send more events?
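The timeout mentioned here is set on the Logstash output in metricbeat.yml. A sketch of the suggested test, assuming the 5.x output.logstash timeout option, which takes a plain number of seconds:

```yaml
# metricbeat.yml: output section only (sketch based on the config posted above)
output.logstash:
  hosts: ["localhost:5044"]
  # Seconds to wait for a response from Logstash before giving up.
  # Default is 30; 3600 (1h) is only a diagnostic value, not a recommended setting.
  timeout: 3600
```

If the errors disappear with the longer timeout, Logstash is likely responding slowly rather than not at all.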

ritchierich · May 11, 2017, 2:43pm

@exekias @steffens

Here are some more details about my Elastic Stack configuration. I haven't had the opportunity to put logging into debug mode or change the timeout. I did try adding -Djava.net.preferIPv4Stack=true in jvm.options, but no luck. Below is logstash-json.log from the same server.

Elastic Stack configuration:

Same configuration running in multiple AWS regions
Logstash - All regions are processing application logs
Metricbeat - Only one region is experiencing metricbeat timeouts

ELK classic pipeline: app Logs/metricbeat => logstash => redis

  {"level":"WARN","loggerName":"logstash.agent","timeMillis":1494465032949,"thread":"LogStash::Runner","logEvent":{"message":"stopping pipeline","id":"main"}}
  {"level":"INFO","loggerName":"logstash.pipeline","timeMillis":1494465047832,"thread":"[main]-pipeline-manager","logEvent":{"message":"Starting pipeline","id":"main","pipeline.workers":16,"pipeline.batch.size":125,"pipeline.batch.delay":5,"pipeline.max_inflight":2000}}
  {"level":"INFO","loggerName":"logstash.inputs.beats","timeMillis":1494465048597,"thread":"[main]-pipeline-manager","logEvent":{"message":"Beats inputs: Starting input listener","address":"0.0.0.0:5044"}}
  {"level":"INFO","loggerName":"logstash.pipeline","timeMillis":1494465048631,"thread":"[main]-pipeline-manager","logEvent":{"message":"Pipeline main started"}}
  {"level":"INFO","loggerName":"logstash.agent","timeMillis":1494465048678,"thread":"Api Webserver","logEvent":{"message":"Successfully started Logstash API endpoint","port":9600}}

steffens · May 11, 2017, 10:11pm

You can configure metricbeat to push to redis directly. Why do you need to run metricbeat and logstash on the same host?
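A sketch of what pushing to redis directly could look like, replacing the output.logstash section. The host and port below are placeholders (the thread does not give the redis address), and the key and datatype are assumptions that would have to match whatever the downstream indexer reads:

```yaml
# metricbeat.yml: output section only (hypothetical values, adjust to your setup)
output.redis:
  hosts: ["localhost:6379"]   # placeholder address, not from the thread
  key: "logstash"             # assumed redis list name
  datatype: list              # publish with RPUSH to a list rather than to a channel
```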

The logstash logs only contain startup information. Have you got some debug logs as well?

Have you collected any debug logs from metricbeat?

In case you want/need to use logstash, can you try logstash with a null or stdout output only (no redis output) and see if it's processing any data?

ritchierich · May 13, 2017, 4:49am

@steffens @exekias

Thanks for the help and suggestions! The metricbeat i/o timeouts threw me off. I'm in the last phase of migrating from ES 1.7.1 to Elastic Stack 5, and part of my strategy is an apples-to-apples migration, which is why I'm sending metricbeat to logstash.

But after diving into it further, the following changes solved the problem.

Metricbeat: Reduced the metricbeat period from 10s to 60s
Logstash Shipper: Added batch and batch_events to the redis output
Logstash Indexer: Added flush_size to the elasticsearch output

Logstash Shipper:
output {
  redis {
    host => {{ elastic_stack.shipper_redis.hosts }}
    shuffle_hosts => true
    data_type => "list"
    batch => "true"
    batch_events => "250"
    key => "logstash"
  }
}

Logstash Indexer:
output {
  elasticsearch {
    flush_size => 4000
    hosts => {{ elastic_stack.indexer_redis.hosts }}
  }
}
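The changed Metricbeat config is not posted in the thread. Based on the original config earlier in the topic, the period change would presumably look like the sketch below; it assumes "60" means 60 seconds and that everything else stayed the same:

```yaml
# metricbeat.yml: module section (reconstructed sketch, only `period` differs from the original post)
metricbeat.modules:
- module: system
  metricsets:
    - cpu
    - load
    - core
    - diskio
    - filesystem
    - fsstat
    - memory
    - network
    - process
  enabled: true
  period: 60s          # was 10s; collects and ships events six times less often
  processes: ['.*']
```

A longer period reduces the number of events sent to logstash, which fits the picture of the pipeline behind it not keeping up.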

Cheers,
Rich

