Filebeat sending duplicate logs to logstash


(Rajshekhar K) #1

Hi Team,

We are testing Filebeat 5.6.2 in our development setup. The logs are sent over SSL to a Logstash instance hosted by logz.io, but we observed duplicate log entries in the Elasticsearch index. Around the same time, we also see the following errors in the Filebeat logs:

2018-02-09T03:13:42Z ERR Failed to publish events caused by: EOF
2018-02-09T03:13:42Z INFO Error publishing events (retrying): EOF
2018-02-09T04:13:52Z ERR Failed to publish events caused by: EOF
2018-02-09T04:13:52Z INFO Error publishing events (retrying): EOF
2018-02-09T05:13:42Z ERR Failed to publish events caused by: EOF
2018-02-09T05:13:42Z INFO Error publishing events (retrying): EOF
2018-02-09T06:14:02Z ERR Failed to publish events caused by: EOF
2018-02-09T06:14:02Z INFO Error publishing events (retrying): EOF
2018-02-09T07:13:43Z ERR Failed to publish events caused by: EOF
2018-02-09T07:13:43Z INFO Error publishing events (retrying): EOF
2018-02-09T08:13:43Z ERR Failed to publish events caused by: EOF
2018-02-09T08:13:43Z INFO Error publishing events (retrying): EOF
2018-02-09T09:13:43Z ERR Failed to publish events caused by: EOF
2018-02-09T09:13:43Z INFO Error publishing events (retrying): EOF

We have tried the following to solve this issue:

  • Checked whether the log entries were generated twice; we didn't find any duplicates in the source log.
  • Checked for any network-related issue that could drop the acknowledgement from Logstash, causing Filebeat to resend logs.
  • As suggested on the Logstash community forum, we checked the client_inactivity_timeout value on Logstash; it's 5 minutes, so it doesn't seem that Logstash is closing the TCP connection due to inactivity.
  • The logz.io team checked whether they are getting any errors in Logstash, but they aren't seeing any related to connection timeouts.
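For reference, the timeout mentioned in the third bullet is configured on the Logstash side, in the Beats input. A minimal sketch of where it lives (the port and SSL settings here are illustrative, not logz.io's actual configuration):

```
input {
  beats {
    port => 5044
    ssl  => true
    # Seconds Logstash waits before closing an idle Beats connection.
    # 300 matches the 5-minute value reported above.
    client_inactivity_timeout => 300
  }
}
```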

Can you please help in resolving this issue?

Thanks


(Pier-Hugues Pellerin) #2

Hello @rajshekhar, Filebeat supports at-least-once semantics, so duplicates can happen if there are network errors. Filebeat splits its communication into batches of events to reduce round trips; if an error occurs, Filebeat retransmits the whole batch, which can create duplicates.

Also, when back pressure happens, Logstash usually sends a keep-alive (ACK 0) signal to Filebeat to tell it to stop sending new events and wait.

In some versions of the Logstash Beats input, that logic was wrong and made clients disconnect more often, which could result in more duplicates.

I don't know which version of the Logstash Beats input logz.io is using.
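Since whole-batch retransmits are the source of the duplicates, one common mitigation (a sketch, not necessarily applicable to logz.io's managed pipeline) is to derive a deterministic document id in Logstash with the fingerprint filter, so a retransmitted event overwrites the original in Elasticsearch instead of being indexed twice. The choice of source fields here is an assumption; pick fields that uniquely identify one log line in your data:

```
filter {
  fingerprint {
    # Hash the fields that uniquely identify an event (assumed fields).
    source              => ["message", "@timestamp"]
    target              => "[@metadata][fingerprint]"
    method              => "SHA1"
    concatenate_sources => true
  }
}

output {
  elasticsearch {
    # Reusing the hash as the document id makes a retransmitted event
    # an overwrite of the same document rather than a new duplicate.
    document_id => "%{[@metadata][fingerprint]}"
  }
}
```

The trade-off is that id-based indexing is slightly more expensive than auto-generated ids, but it makes indexing idempotent under Filebeat's at-least-once delivery.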


(system) #3

This topic was automatically closed after 21 days. New replies are no longer allowed.