As far as I am aware, Logstash processes each event independently; one event cannot see or compare itself against another. There are two options I can think of, though:
- Remove the duplicates from the CSV file prior to sending it into Logstash. I assume this isn't possible in your case.
- Use a custom document_id so that duplicate events are overwritten.
Basically, when you insert a document into Elasticsearch, it generates its own document ID. Think of it as a primary key in a database. If you want, you can set your own document ID in Logstash's Elasticsearch output. Then, when a duplicate event comes in, it overwrites and updates the existing document instead of creating a new one. In your case all of the data will be the same, so nothing is lost, and the duplicate is prevented. Your CSV file just has to contain some kind of unique identifier.
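Here is a minimal sketch of what that pipeline could look like. The file path, index name, and the transaction_id column are hypothetical placeholders; substitute whatever unique column your CSV actually has.

```
input {
  file {
    path => "/path/to/data.csv"     # hypothetical path to your CSV
    start_position => "beginning"
    sincedb_path => "/dev/null"     # re-read the file on every run (handy for testing)
  }
}

filter {
  csv {
    # hypothetical column names; transaction_id is the unique identifier
    columns => ["transaction_id", "field_a", "field_b"]
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "my-index"                    # hypothetical index name
    document_id => "%{transaction_id}"     # duplicates now overwrite instead of piling up
  }
}
```

Because the document_id is derived from the row itself, inserting the same row twice results in a single document in Elasticsearch: the second insert is treated as an update of the first.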