I want to remove duplicate events inside a Logstash filter. How can I do that? I have listed the events below; please have a look and suggest an approach.

You can use a fingerprint filter with the concatenate_all_fields option set to true. If you are sending events to Elasticsearch, use the fingerprint as the document_id so that a duplicate event overwrites the existing document instead of being indexed a second time.
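As a rough sketch of that setup, something like the following pipeline fragment should work. The host, index name, hash method, and the choice of `[@metadata][fingerprint]` as the target are assumptions for illustration, not something taken from your configuration.

```
filter {
  fingerprint {
    # Hash every field of the event into a single value
    concatenate_all_fields => true
    # Assumed method; older plugin versions may also require a key for SHA methods
    method => "SHA256"
    # Stored under @metadata so the hash is not indexed as a regular field
    target => "[@metadata][fingerprint]"
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]   # assumed host
    index => "events"                    # assumed index name
    # Identical events map to the same _id, so the later one overwrites the earlier
    document_id => "%{[@metadata][fingerprint]}"
  }
}
```

One caveat: as far as I know, concatenate_all_fields also hashes @timestamp, so two events that differ only in ingestion time will get different fingerprints. If that applies to your data, listing the relevant fields in `source` and setting `concatenate_sources => true` is probably the better fit.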

If you really want to do the de-duplication in Logstash (because you are not writing to Elasticsearch), then you would need to use a ruby filter that builds a cache of recently seen fingerprints. You would look for the fingerprint in the cache and call event.cancel if it is found, or add it to the cache if not. If you have multiple worker threads, then you will need to synchronize access to the cache, and you will need to implement a cache purge strategy. Decidedly non-trivial.
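To give an idea of what that cache could look like, here is a minimal sketch in plain Ruby. The class name `SeenCache`, the TTL and size limits, and the file path in the comments are all hypothetical. The purge strategy shown (time-based expiry plus a size cap that evicts the oldest entry) is just one possible choice, and the cache is in-memory only, so it is lost on restart.

```ruby
# Hypothetical helper: remembers recently seen fingerprints for a limited time.
class SeenCache
  def initialize(ttl_seconds: 300, max_size: 100_000,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl_seconds
    @max_size = max_size
    @clock = clock
    @seen = {}          # fingerprint => time first seen (hashes keep insertion order)
    @lock = Mutex.new   # guards @seen when several worker threads share the cache
  end

  # Returns true if the fingerprint was seen within the TTL (a duplicate).
  # Otherwise records it and returns false.
  def duplicate?(fingerprint)
    @lock.synchronize do
      now = @clock.call
      evict_expired(now)
      return true if @seen.key?(fingerprint)

      @seen[fingerprint] = now
      # Size cap: drop the oldest entries once the cache grows past max_size
      @seen.shift while @seen.size > @max_size
      false
    end
  end

  def size
    @lock.synchronize { @seen.size }
  end

  private

  # Entries are inserted in time order, so the oldest is always first.
  def evict_expired(now)
    loop do
      oldest = @seen.first
      break if oldest.nil? || now - oldest[1] < @ttl
      @seen.shift
    end
  end
end

# Possible wiring in a ruby filter (path and field name are assumptions):
#   init => "require '/etc/logstash/seen_cache.rb'; @cache = SeenCache.new"
#   code => "event.cancel if @cache.duplicate?(event.get('[@metadata][fingerprint]'))"
```

Note that a duplicate does not refresh the entry's timestamp here, so an event is suppressed for at most one TTL window after it was first seen. Whether that is the right behaviour depends on how far apart your duplicates arrive.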
