Load balancing with ERR : Failed to publish events caused by: write tcp

I am getting the following error messages from Filebeat.

2017-02-27T16:12:07+09:00 ERR Failed to publish events caused by: write tcp clientIP:44036->HostIP1:5044: write: connection reset by peer
2017-02-27T16:12:07+09:00 INFO Error publishing events (retrying): write tcp ClientIP:44036->HostIP2:5044: write: connection reset by peer

I have two Logstash instances and two Filebeat instances. Each Filebeat reads many files, and the number will grow. I added a load-balancing configuration that forwards to both Logstash instances, and the error occurs often. With many Filebeats, it looks like a deadlock.

Both of my filebeat.yml files contain:

output.logstash:
      hosts: ["host1:5044", "host2:5044" ]
      loadbalance: true
      index: filebeat
      bulk_max_size: 1024

Logstash version: 5.2.0
Beats version: 5.2.0
OS: CentOS

Is Logstash running? Is there anything in the Logstash logs? The connection is being closed by the Logstash host while Filebeat is trying to send events.

Which version of the logstash-input-beats plugin is shipped with your Logstash? Can you try upgrading or downgrading to version 3.1.10?

Have you tried increasing client_inactivity_timeout in Logstash?

Thanks for the reply. :smile:
Logstash does not log anything. The errors only appear in Filebeat.
I am using the same versions. I set client_inactivity_timeout = 3000 in Logstash and the errors are gone.

Q1.
But I wonder why the connection between the Logstash host and the Filebeat client is closed. After the TCP handshake, does Logstash assume the client is finished and close the connection if no data arrives?

Q2.
I have more than 20 Filebeats and may need a large client_inactivity_timeout, which would leave many idle clients connected. Does that affect Logstash resource usage such as CPU or memory?

Which version of the logstash-input-beats plugin do you have installed? With load balancing it might take somewhat longer for all Logstash instances to receive events from one Filebeat, because with your config Filebeat operates in lock-step: if one Logstash instance blocks, the other gets no events.

I'm not sure, but I think there was a bug in the client_inactivity_timeout handling that also closed active clients. I've read that upgrading or downgrading to 3.1.10 solved the issue for some users.

If Logstash closes a connection, Filebeat will reconnect and continue sending events. No events will be lost.

But I wonder why the connection between the Logstash host and the Filebeat client is closed. After the TCP handshake, does Logstash assume the client is finished and close the connection if no data arrives?

Beats only connect when they have data to send. After a Beat has received the ACK, a timer may close the connection if the Beat does not send any new events within the configured time window. Normally the idle timer in Logstash is inactive while a batch is being received. Maybe there's a bug in when the timer is switched between active and inactive WHILE a batch of events is being received (if you're sending big multiline events, consider decreasing bulk_max_size to reduce event decoding time).
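
The suggestion to lower bulk_max_size could look like this in filebeat.yml. This is only a sketch: it reuses the output section from the original post, and the value 256 is an assumed number chosen for illustration, not a value recommended anywhere in this thread.

```yaml
output.logstash:
  hosts: ["host1:5044", "host2:5044"]
  loadbalance: true
  index: filebeat
  # Assumed value for illustration only: smaller than the original 1024,
  # so each batch takes less time to decode on the Logstash side.
  bulk_max_size: 256
```

Smaller batches mean more round trips per event, so throughput may drop. The right value depends on how large your multiline events are.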

I have more than 20 Filebeats and may need a large client_inactivity_timeout, which would leave many idle clients connected. Does that affect Logstash resource usage such as CPU or memory?

client_inactivity_timeout applies per client and only counts time during which that client has not sent any events. While Logstash is processing a batch of events from a Beat (or has it enqueued), the idle timer is inactive. An idle client takes no resources other than a file descriptor: idle means no data sent, which means no CPU or memory used in Logstash. Because the OS limits the number of file descriptors, new clients might not be able to connect if you run out of them. But 20 clients doesn't sound like much.
