Filebeat: duplicate events


#1

I have a very simple ELK stack POC environment: Filebeat > Logstash > Elasticsearch > Kibana.
All components have been updated to 5.2 (previously 5.0).
None of the ELK stack components are clustered.

If I remove filebeat and use a file as input into logstash, the number of events created is as expected.
However, when I use filebeat as the input (with the same file being ingested), I get over 10% more events.

Having trawled through the output, I can see that the extra events are duplicates created by filebeat; the lines are not duplicated in the input file.

Is this an issue with filebeat? Are there any suggested workarounds?


(ruflin) #2

Can you check if you get transmission errors in the filebeat log? Are there connection errors between FB and LS?


#3

I have a couple of errors in the filebeat log:

2017-02-03T08:30:33Z ERR Failed to publish events caused by: read tcp : i/o timeout
2017-02-03T08:30:33Z INFO Error publishing events (retrying): read tcp : i/o timeout

No errors in logstash... just a couple of
[2017-02-03T08:30:33,694][WARN ][logstash.filters.grok ] Timeout executing grok against field 'message' with value 'Value too large to output (592 bytes)!


(Steffen Siering) #4

Hmmm... The error message in beats comes from beats timing out while waiting for the ACK from logstash. The default timeout is 30s. If you increase the timeout in beats to maybe 5 minutes, does it improve the situation?

The grok timeout is interesting too. Maybe a slow or inefficient grok pattern is clogging the pipeline in logstash. You may want to check whether you can optimize your grok pattern, or investigate the dissect filter.
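
To illustrate how a slow pipeline and a short timeout can combine into duplicates, here is a minimal toy model in Python. It is only a sketch, not how beats is implemented: the function name `deliver` and the processing times (45s, then 10s) are made up for illustration. It assumes the receiver indexes every batch it gets, while the sender only stops resending once an ACK arrives within the timeout.

```python
def deliver(batch, processing_seconds, timeout_seconds):
    """Toy model of one batch sent with retry-until-ACK.

    processing_seconds lists how long the receiver takes on each successive
    attempt (hypothetical numbers). The receiver indexes every batch it
    receives; the sender only sees the ACK if it arrives in time.
    """
    indexed = []
    for attempt, took in enumerate(processing_seconds, start=1):
        indexed.extend(batch)        # receiver processes the batch regardless
        if took <= timeout_seconds:  # ACK arrived in time, sender stops
            return indexed, attempt
        # otherwise the sender hits an i/o timeout and resends the same batch
    return indexed, len(processing_seconds)


batch = ["line 1", "line 2", "line 3"]
slow_then_fast = [45, 10]  # first attempt stalls (e.g. slow grok), second is quick

short_indexed, short_attempts = deliver(batch, slow_then_fast, timeout_seconds=30)
long_indexed, long_attempts = deliver(batch, slow_then_fast, timeout_seconds=300)

print(len(short_indexed), short_attempts)  # 6 2: the batch was indexed twice
print(len(long_indexed), long_attempts)    # 3 1: the longer timeout avoids the resend
```

In this model the 30s timeout turns one stalled attempt into a full duplicate of the batch, while the 5 minute timeout simply waits it out.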


(Saurabh) #5

Hi Ruflin

I have a filebeat configuration which reads a file (1000 log lines) and sends it to logstash, which in turn sends it to Elasticsearch.
So Elasticsearch has 1000 documents added to a particular index.
Now if 5 more lines get added to the log file, filebeat sends 1005 events instead of sending only the 5 that were newly added.
Could you please advise whether there is an option I need to change to ingest only the newly added lines?


(Steffen Siering) #6

The reason you get 1005 events is that filebeat is not receiving the ACK from logstash. Without the ACK, filebeat doesn't know whether the lines have been processed or not, so it has to send everything again.


(Saurabh) #7

No, in some cases it sends the whole file again, but in other cases only the newly added records. Is there any way we can configure it so that it continuously monitors the file and sends whatever gets added to logstash? Any help would be highly appreciated.


(Steffen Siering) #8

filebeat continuously monitors the file and only sends lines that have not yet been ACKed by the output. The reason you get the original 1000 lines again is that they have not been ACKed by the output (due to the timeout when waiting for the ACK). Note that even though the events have not been ACKed by logstash, logstash might still have received and processed them.
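
The 1000 vs. 1005 scenario can be sketched with a small toy model in Python. This is not filebeat code: `ToyHarvester`, `publish` and `run` are hypothetical names, and the only assumption modelled is the one described above, that the read offset advances only once the output ACKs.

```python
class ToyHarvester:
    """Toy model: a read offset that only advances once the output ACKs."""

    def __init__(self):
        self.acked_offset = 0  # stands in for the offset the shipper persists

    def publish(self, lines, ack_received):
        pending = lines[self.acked_offset:]  # everything not yet ACKed
        if ack_received:
            self.acked_offset = len(lines)   # only now is the offset moved
        return pending                       # what actually goes over the wire


def run(first_ack_received):
    log = [f"event {i}" for i in range(1000)]
    harvester = ToyHarvester()
    # First publish: the receiver may index all 1000 lines even if the ACK is lost.
    harvester.publish(log, ack_received=first_ack_received)
    log += [f"new event {i}" for i in range(5)]
    return len(harvester.publish(log, ack_received=True))


resent_after_lost_ack = run(first_ack_received=False)
sent_after_ack = run(first_ack_received=True)

print(resent_after_lost_ack)  # 1005: the unACKed 1000 lines go out again
print(sent_after_ack)         # 5: only the newly added lines
```

This also matches the observation that sometimes the whole file is resent and sometimes only the new records: in the model, it depends on whether the previous ACK arrived.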

Which filebeat, logstash and logstash-input-beats plugin versions are you using?

