Hi,
I'm using Filebeat to read a custom log file; every record (line) is sent to Logstash to be normalized, then into Elasticsearch and finally into Kibana to visualize it.
Unfortunately, I'm seeing a lot of duplicated events. I know this could be a problem with the ACK communication between FB and LS, but I cannot find any specific ACK error in FB's logs.
Not every record is duplicated.
Not every record is duplicated the same number of times: one record may be published twice, another one 3 times (generally no more than that).
Very often I find this sequence in the FB logs:
2017-05-26T16:39:42+02:00 DBG handle error: read tcp 172.21.20.71:53429->10.0.18.131:5045: wsarecv: An existing connection was forcibly closed by the remote host.
2017-05-26T16:39:42+02:00 DBG closing
2017-05-26T16:39:42+02:00 DBG 0 events out of 143 events sent to logstash. Continue sending
2017-05-26T16:39:42+02:00 DBG close connection
2017-05-26T16:39:42+02:00 ERR Failed to publish events caused by: read tcp 172.21.20.71:53429->10.0.18.131:5045: wsarecv: An existing connection was forcibly closed by the remote host.
2017-05-26T16:39:42+02:00 INFO Error publishing events (retrying): read tcp 172.21.20.71:53429->10.0.18.131:5045: wsarecv: An existing connection was forcibly closed by the remote host.
Filebeat and Logstash are not on the same VLAN, but let me ask a naive question: if this were a firewall issue, wouldn't it create duplicated records every time?
Generally Filebeat creates duplicates when sending the events succeeds but the ACK for the batch doesn't make it back. My suspicion is that the router closes the connections on inactivity; that's why not all docs are duplicated.
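To make the retry mechanics concrete, here is a minimal filebeat.yml sketch of the Logstash output options involved. The option names are standard Filebeat settings, but the values are illustrative assumptions only, not recommendations:

output.logstash:
  hosts: ["10.0.18.131:5045"]
  # Seconds Filebeat waits for a response (ACK) from Logstash before it
  # considers the request failed and retries (default is 30).
  timeout: 60
  # Maximum events per batch. Events whose ACK never arrives are re-sent,
  # so smaller batches limit how many already-processed events can be
  # duplicated when the connection drops right before the ACK comes back.
  bulk_max_size: 512

If the beats input on the Logstash side supports it, raising its client_inactivity_timeout may also keep idle connections from being closed, but verify that against the plugin version you are actually running.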
Yeah, that error has nothing to do with the duplicate events. The "An existing connection was forcibly closed by the remote host." message indicates that either Logstash is restarting or there's a connectivity issue. Can you double-check that LS is not restarting every so often?
It's possible that I've identified the issue.
I found a lot of lines like the following:
Remove state for file as file removed: E:\Scripts\Admin\ExtractorCat\outFile_20170531_2.roger
...
...
State removed for E:\Scripts\Admin\ExtractorCat\outFile_20170531_2.roger because of older: 0s
I write every record into a specific file (e.g. outFile_20170531_2.roger). When I have to add records, I rename the file to *.rogerTMP to avoid wasting time, since Filebeat locks it.
When Filebeat runs a scan, it sometimes cannot find those files, so it removes the saved state of the original file.
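For reference, that state removal is governed by the prospector's clean_*/close_* options. The filebeat.yml sketch below only illustrates which settings are in play; the path pattern and values are assumptions to adapt and verify against your Filebeat version:

filebeat.prospectors:
  - input_type: log
    paths:
      - 'E:\Scripts\Admin\ExtractorCat\outFile_*.roger'
    # clean_removed (enabled by default) drops the registry state for any
    # file that cannot be found during a scan. With the rename-to-*.rogerTMP
    # workflow the file briefly disappears, its state is dropped, and when
    # it reappears it is read again from the beginning, producing duplicates.
    clean_removed: false
    # close_removed / close_renamed control when the harvester releases the
    # file handle after a file is removed or renamed.
    close_removed: true
    close_renamed: false

If appending to the file in place is feasible for your generator script, the file never disappears from a scan, so the state is never dropped in the first place.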
After several days I can say that this issue is not fixed on my side, so I would like to ask: what do you think is the best way to feed a file that will be parsed and inspected by Filebeat?
Logstash 5.2 ships with an old logstash-input-beats plugin. The plugin's changelog suggests some fixes regarding LS closing connections. Unless you've already updated the plugin to the most recent version, you can try updating the plugin or Logstash itself.
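If you go the plugin route, something along these lines should work from the Logstash home directory (Logstash 5.x CLI; check the installed version before and after so you can confirm the update actually happened):

bin/logstash-plugin list --verbose logstash-input-beats
bin/logstash-plugin update logstash-input-beats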