Duplicated events using Filebeat

I'm using Filebeat to read a custom log file; every record (line) is sent to Logstash to normalize it, then into Elastic and then into Kibana to visualize it.
Unfortunately, I see a lot of duplicated events. I know this could be a problem with the ACK communication between FB and LS, but I cannot find any specific ACK error in FB's logs.

Not every record is duplicated.
Not every record is duplicated the same number of times: one record might be published twice, another three times (generally no more than that).

Very often I find this sequence in the FB logs:

2017-05-26T16:39:42+02:00 DBG  handle error: read tcp> wsarecv: An existing connection was forcibly closed by the remote host.
2017-05-26T16:39:42+02:00 DBG  closing
2017-05-26T16:39:42+02:00 DBG  0 events out of 143 events sent to logstash. Continue sending
2017-05-26T16:39:42+02:00 DBG  close connection
2017-05-26T16:39:42+02:00 ERR Failed to publish events caused by: read tcp> wsarecv: An existing connection was forcibly closed by the remote host.
2017-05-26T16:39:42+02:00 INFO Error publishing events (retrying): read tcp> wsarecv: An existing connection was forcibly closed by the remote host.

Can you help me?

Is there any firewall or router between Filebeat and Logstash that could occasionally terminate connections?

Filebeat and Logstash are not on the same VLAN. This may be a naive question, but if a firewall were the problem, wouldn't it create duplicated records every time?

Generally Filebeat creates duplicates when sending the events is successful, but the ACK for the batch doesn't make it back. My suspicion is that the router closes down the connections on inactivity, that's why not all docs are duplicated.
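
To make the mechanism above concrete, here is a minimal toy simulation in Python (not Filebeat code). It assumes the receiver always stores the batch and that the only failure is a lost ACK; the Link class, the publish function and the record names are all made up for illustration.

```python
from collections import Counter


class Link:
    """Toy connection: every batch is delivered, but the ACK may get lost."""

    def __init__(self, ack_results):
        # ack_results: one boolean per send() call, True = ACK reached the sender.
        self._acks = iter(ack_results)
        self.received = []  # what the receiver (Logstash stand-in) has stored

    def send(self, batch):
        # The receiver always stores the batch ...
        self.received.extend(batch)
        # ... but the sender only learns about it if the ACK comes back.
        return next(self._acks, True)


def publish(link, batch, max_retries=3):
    """At-least-once sender: resend the whole batch until an ACK arrives."""
    for attempt in range(1, max_retries + 1):
        if link.send(batch):
            return attempt
    raise RuntimeError("no ACK after %d attempts" % max_retries)


# Hypothetical ACK outcomes: ok, lost, ok, lost, lost, ok
link = Link(ack_results=[True, False, True, False, False, True])
publish(link, ["a1", "a2"])  # ACK arrives on the first attempt
publish(link, ["b1"])        # first ACK lost, batch is sent twice
publish(link, ["c1"])        # two ACKs lost, batch is sent three times

counts = Counter(link.received)
print(counts)
```

In this sketch b1 ends up stored twice and c1 three times, while a1 and a2 are stored once. That matches the pattern of only some records being duplicated, and only when a connection drops at the wrong moment.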

Is there any parameter I can tune to avoid the connection being closed?

I also found a lot of these messages:

E:\ExtractorCat\outFile_20170526_4.roger. Closing because close_inactive of 1m0s reached.


Flushing spooler because of timeout. Events flushed:90
Flushing spooler because of timeout. Events flushed:0

Sounds like you have close_inactive set. Review the docs for it and increase the value if you need to.

The "Flushing spooler" message is at debug level, I think, so you can just raise the logging level (for example to info) to hide it.

I will tune it, but I guess this is not the cause of my issue, right?

Yeah, it has nothing to do with the duplicate events. The "An existing connection was forcibly closed by the remote host." message indicates that either Logstash is restarting or there's a connectivity issue. Can you double-check that LS is not restarting every so often?

So, I'm trying the solution from this topic: "[SOLVED] TCP reset".

I also noticed that I was using congestion_threshold => 20, which is now deprecated. So I should remove it, right?

I may have identified the issue.
I found a lot of lines like the following:

Remove state for file as file removed: E:\Scripts\Admin\ExtractorCat\outFile_20170531_2.roger
State removed for E:\Scripts\Admin\ExtractorCat\outFile_20170531_2.roger because of older: 0s

I write every record into a specific file (e.g. outFile_20170531_2.roger). When I have to add records, I rename the file to *.rogerTMP to avoid wasting time, since Filebeat locks it.
When Filebeat runs a scan, it sometimes cannot find those files, so it removes the saved state of the original file.

Could this be the cause of my issue?
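
The hypothesis above can be sketched with a small toy model in Python. This is not how Filebeat is implemented: it assumes state is a simple per-name line offset that is dropped whenever a scan does not see the file, and the ToyHarvester class and the out.roger file name are hypothetical. As far as I know, real Filebeat tracks files by file identity rather than by name, and this kind of state removal is tied to the clean_removed option.

```python
class ToyHarvester:
    """Toy tailer that keeps a per-file read offset (its 'state')."""

    def __init__(self):
        self.offsets = {}  # file name -> number of lines already shipped
        self.shipped = []  # every line sent downstream, in order

    def scan(self, files):
        """files: dict of file name -> list of lines visible during this scan."""
        # Forget state for files that are not visible in this scan
        # (loosely mirrors "Remove state for file as file removed").
        for name in list(self.offsets):
            if name not in files:
                del self.offsets[name]
        # Ship everything beyond the stored offset; unknown files start at 0.
        for name, lines in files.items():
            start = self.offsets.get(name, 0)
            self.shipped.extend(lines[start:])
            self.offsets[name] = len(lines)


# Case 1: a scan happens while the file is renamed to *.rogerTMP (not matched).
h = ToyHarvester()
h.scan({"out.roger": ["r1", "r2"]})        # r1, r2 shipped
h.scan({})                                 # file renamed away, state dropped
h.scan({"out.roger": ["r1", "r2", "r3"]})  # read again from the start

# Case 2: no scan falls into the rename window, so the state survives.
control = ToyHarvester()
control.scan({"out.roger": ["r1", "r2"]})
control.scan({"out.roger": ["r1", "r2", "r3"]})

print(h.shipped)
print(control.shipped)
```

In the first case r1 and r2 are shipped twice; in the second case each record is shipped once. Under these assumptions, whether a record is duplicated depends only on whether a scan happens to fall inside the rename window, which would explain why not every record is affected.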

After several days I can say that this issue is not fixed on my side. So I would like to ask: what do you think is the best way to feed a file that will be parsed and inspected by Filebeat?

Which versions of Logstash, Filebeat and the logstash-input-beats plugin are you using? I'd recommend updating Logstash to the most recent version.

Filebeat 5.2.1
Logstash 5.2
Elasticsearch 5.0.0

Logstash 5.2 ships with an old logstash-input-beats plugin. The plugin's changelog mentions some fixes regarding LS closing connections. Unless you've already updated the plugin to the most recent version, you can try updating the plugin or Logstash itself.
