How to avoid duplicates before entering the filter plugin?

Badger · May 28, 2020, 4:12pm

In some cases duplicates can be removed upstream, before the events ever reach Logstash. For example, a jdbc input can be configured with a 'SELECT DISTINCT' statement, which eliminates duplicate rows from the query result.
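
As a rough sketch of the upstream approach, a jdbc input could look something like the following. The driver path, connection string, user, table, and column names are all placeholders I made up for illustration, so substitute your own.

```
input {
  jdbc {
    # Placeholder connection settings (assumptions, not taken from the question)
    jdbc_driver_library => "/path/to/postgresql.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/mydb"
    jdbc_user => "logstash"
    # DISTINCT makes the database collapse identical rows before Logstash sees them.
    # The table and columns here are hypothetical.
    statement => "SELECT DISTINCT customer_id, event_type, created_at FROM events"
  }
}
```

Note that DISTINCT only removes duplicates within a single result set. If the input runs on a schedule and the same rows are selected again on a later run, they will still come through again.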

In the filters, if you are writing to elasticsearch, you might be able to add an elasticsearch filter that checks whether a document already exists before processing the event. That lookup is not cheap, though, and may not be a net optimization. You would need to benchmark with and without it.
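
A minimal sketch of that lookup, assuming each event carries a field (here called event_id, a hypothetical name) that is also used as the document id in elasticsearch. The host and index name are placeholders too.

```
filter {
  elasticsearch {
    # Placeholder host and index (assumptions)
    hosts => ["localhost:9200"]
    index => "events"
    # Look for an existing document whose _id matches this event's id field
    query => "_id:%{[event_id]}"
    # If a match is found, copy its @timestamp into a marker field on this event
    fields => { "@timestamp" => "existing_doc_timestamp" }
  }
  # The marker field is only set when the lookup found a document
  if [existing_doc_timestamp] {
    drop { }
  }
}
```

Putting this at the top of the filter section means duplicates are dropped before any of the more expensive filters run, but every event then costs one extra query against elasticsearch, which is why benchmarking matters.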
