I am using a Logstash pipeline to move data from Elasticsearch to Kafka.
I have an index (asd_logstash-2018.07.15) with around 1,000 docs, but when I check the Kafka topic it has around 22k messages and the count keeps increasing.
I came across this blog https://www.elastic.co/blog/logstash-lessons-handling-duplicates, but I don't know how to pass the document ID to Kafka.
Here is my config.
input {
  elasticsearch {
    hosts   => "192.168.0.254:9200"
    index   => "asd_logstash-2018.07.15"
    size    => 1000
    scroll  => "5m"
    docinfo => true
  }
}

output {
  kafka {
    codec             => json
    topic_id          => "sampletest"
    bootstrap_servers => "192.168.0.64:9092"
  }
}
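
From what I can tell in the plugin docs, docinfo => true makes the elasticsearch input store each hit's metadata under [@metadata], so [@metadata][_id] should hold the document ID (assuming the default docinfo_target), and the kafka output has a message_key option. Would something like this be the right way to key each message by the Elasticsearch _id?

output {
  kafka {
    codec             => json
    topic_id          => "sampletest"
    bootstrap_servers => "192.168.0.64:9092"
    # Key each Kafka record with the Elasticsearch document ID.
    # [@metadata][_id] assumes docinfo => true with the default docinfo_target.
    message_key       => "%{[@metadata][_id]}"
  }
}

My understanding is that the key by itself doesn't deduplicate anything: Kafka only collapses records with the same key if the topic uses log compaction, otherwise the consumer has to dedupe on the key. Is that correct?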