Filebeat duplicate logs

Hello everyone,

To prevent duplicate data coming from Filebeat, I used this Logstash fingerprint filter:

fingerprint {
  # Hash the raw log line (keyed SHA1) and store the result in metadata
  # so it can be used as the Elasticsearch document ID.
  source => "message"
  target => "[@metadata][fingerprint]"
  method => "SHA1"
  key => "key"
  base64encode => true
}

but if a duplicate appears inside the log file itself, this filter suppresses it as well.

  • I need to prevent duplicate data that can occur in cases like a Filebeat restart, but if the log file itself contains duplicated lines, I want to keep them.

Thank you.

Are you using the fingerprint as document ID in your Elasticsearch output? Are you indexing into time-based indices based on rollover?
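I.e., do you have something along these lines in your output? (Just a rough sketch; the hosts and index name are placeholders.)

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "mylogs-%{+YYYY.MM.dd}"
    # Using the fingerprint as the document ID means a repeated event
    # overwrites the existing document instead of creating a new one.
    document_id => "%{[@metadata][fingerprint]}"
  }
}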

Yes. I don't have problems with the Logstash pipeline configuration itself; the problem is how to prevent duplicate data in cases like a Filebeat restart, without also suppressing duplicates that genuinely exist in the logs.

I do not understand what you mean. Can you please elaborate?

Some cases, such as a Filebeat restart, can cause data duplication. To handle that, I used the fingerprint filter, but this approach suppresses all duplication, even when the log file itself contains duplicate lines.
I want duplicates caused by Filebeat (or anything similar) to be prevented, but duplicates that come from the log file itself to be allowed.

Filebeat should be able to handle restarts without duplicating a lot of data. What type of storage are you reading from? How is your Filebeat configured?

In your example you are calculating a fingerprint based on the contents of the log line. Identical log lines in log files will therefore result in the same fingerprint and cause updates in Elasticsearch. You could add the filename to the string you use to determine the fingerprint and this would allow the same log line from different files to be inserted without resulting in updates.
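A sketch of how that could look, assuming Filebeat ships the file path in [log][file][path] (older Filebeat versions used [source] instead):

fingerprint {
  # Combine the file path and the line contents so identical lines
  # from different files produce different fingerprints.
  source => ["[log][file][path]", "message"]
  concatenate_sources => true
  target => "[@metadata][fingerprint]"
  method => "SHA1"
  key => "key"
  base64encode => true
}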

If there are duplicates within the same log file, what is the solution?

Filebeat adds the offset of each log line to the event, so you could include this when calculating the fingerprint. I do not believe the Logstash file input plugin is able to do this.
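A sketch of that, assuming the offset arrives in [log][offset] (older Filebeat versions used [offset]):

fingerprint {
  # Path + offset + message identify a physical line in a file, so a
  # line re-read after a Filebeat restart still maps to the same ID,
  # while a genuinely repeated line at a different offset is kept.
  source => ["[log][file][path]", "[log][offset]", "message"]
  concatenate_sources => true
  target => "[@metadata][fingerprint]"
  method => "SHA1"
  key => "key"
  base64encode => true
}

With the offset included, a duplicate line that appears later in the same file gets a different fingerprint and is indexed as a new document, which is the behaviour you described wanting.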
