How to avoid duplicating data in Elasticsearch

Hello everyone,

I have a Logstash pipeline that processes data from a .csv file.

The file has a column called id.

How do I avoid indexing duplicate data for the same id if I reprocess the file, or process another file that contains the same ids?

Here is my current filter:

filter {
    csv {
        separator => ","
        skip_header => "true"
        columns => ["id","product","job description","uuid Value"]
    }
}

You should use the document_id option in the elasticsearch output.

Something like this:

output {
    elasticsearch {
        hosts => ["hosts"]
        index => "index-name"
        document_id => "%{id}"
    }
}
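With document_id => "%{id}", the value of the id column becomes the document's _id, so reprocessing the same file (or another file containing the same ids) overwrites the existing documents instead of creating duplicates; only the _version field increments. Below is a minimal end-to-end sketch of that idea, assuming a hypothetical input path and a local Elasticsearch host, so adjust both to your setup:

input {
    file {
        path => "/path/to/products.csv"      # hypothetical path, replace with your file
        start_position => "beginning"
        sincedb_path => "/dev/null"          # reread the file on each run (useful for testing)
    }
}

filter {
    csv {
        separator => ","
        skip_header => "true"
        columns => ["id","product","job description","uuid Value"]
    }
}

output {
    elasticsearch {
        hosts => ["http://localhost:9200"]   # assumed host, replace with your cluster
        index => "index-name"
        document_id => "%{id}"               # same id -> same _id -> document is overwritten, not duplicated
    }
}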
