How to avoid duplicates before entering the filter plugin?

In some cases this can be done upstream, in the input itself. For example, a jdbc input's statement could use 'SELECT DISTINCT', which would eliminate duplicate rows from the result of that query.
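
As a rough sketch of the upstream approach, a jdbc input could look something like the following. The driver path, driver class, connection string, user, table name and column names are all placeholders made up for illustration; substitute whatever your database actually uses.

```
input {
  jdbc {
    # Placeholder connection settings (assumptions, not from the question)
    jdbc_driver_library => "/path/to/postgresql.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/mydb"
    jdbc_user => "logstash"

    # DISTINCT makes the database drop duplicate rows before logstash sees them
    statement => "SELECT DISTINCT customer_id, email FROM customers"

    # Hypothetical schedule: run the query once a minute
    schedule => "* * * * *"
  }
}
```

Note that DISTINCT only removes duplicate rows within a single execution of the query. It does nothing about rows that are returned again on a later scheduled run.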

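Within the filter section, if you are writing to elasticsearch, you might be able to put an elasticsearch filter first that checks whether the document already exists before doing any further processing. But that is not cheap, since it means a query against elasticsearch for every event, so it may not be an optimization. You would need to benchmark the pipeline both with and without the lookup.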
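
A minimal sketch of that existence check might look like this. It assumes each event carries a unique [event_id] field that is also indexed in elasticsearch; that field name, the host, the index name and the [already_indexed_at] field are all hypothetical, so treat this as a starting point rather than a tested configuration.

```
filter {
  elasticsearch {
    # Placeholder host and index (assumptions)
    hosts => ["localhost:9200"]
    index => "my-index"

    # Look for an already-indexed document with the same (hypothetical) event_id
    query => "event_id:%{[event_id]}"

    # If a document is found, copy its @timestamp into a marker field on this event
    fields => { "@timestamp" => "already_indexed_at" }
  }

  # The marker field should only be present when the lookup found a match
  if [already_indexed_at] {
    drop { }
  }

  # ... the expensive filters would go here, after the duplicates are dropped ...
}
```

The idea is that the fields option only copies values from a matching document, so when nothing matches the marker field stays unset and the event continues through the rest of the filters. Whether the extra round trip per event costs less than the processing it saves is exactly what the benchmark would need to show.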