I have the following case:
one Filebeat instance sending events to Kafka, and one Logstash instance reading from Kafka and sending to Elasticsearch.
What is happening is a massive duplication of events in Elasticsearch, at seemingly random intervals.
I have verified that the files are being written correctly into Kafka by Filebeat, and that the offsets in Kafka are correct, but Logstash sometimes duplicates events.
When I read the topic with the console consumer, I only get one copy of each event.
I have also tried adjusting heartbeat_interval_ms, which reduced the duplication, but I would like to eliminate it completely.
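For reference, the direction I have been experimenting with looks roughly like the sketch below; the session_timeout_ms and heartbeat_interval_ms values are guesses on my part (and I am assuming session_timeout_ms is available in this plugin version), not recommendations:

input {
  kafka {
    topics => ["xxxxxxx"]
    group_id => "logstash-xxxxxx"
    bootstrap_servers => "xxxxxxx:6667,xxxxxxxxx:6667"
    consumer_threads => 4
    # guess: keep session_timeout_ms several times larger than heartbeat_interval_ms,
    # so a slow poll does not get the consumer kicked out of the group and rebalanced
    session_timeout_ms => "30000"
    heartbeat_interval_ms => "3000"
    # commit offsets less aggressively than every 100 ms
    auto_commit_interval_ms => "1000"
    codec => "json"
  }
}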
The versions I'm using are:
filebeat 5.0.0
kafka 0.9
logstash 5.5.1 with logstash-input-kafka (4.2.0)
The Logstash input configuration I'm using is the following:
input {
  kafka {
    topics => ["xxxxxxx"]
    group_id => "logstash-xxxxxx"
    bootstrap_servers => "xxxxxxx:6667,xxxxxxxxx:6667"
    consumer_threads => 4
    heartbeat_interval_ms => "500"
    auto_commit_interval_ms => "100"
    codec => "json"
  }
}
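As a possible workaround (not tested on my side yet), I am also considering making the indexing idempotent: hashing each event into a fingerprint and using it as the Elasticsearch document_id, so a redelivered event overwrites the existing document instead of creating a duplicate. Something like the sketch below; the hosts, index name, and key are placeholders, not my real values:

filter {
  fingerprint {
    # hash the raw message; the key value is arbitrary and only used for the HMAC
    source => "message"
    target => "[@metadata][fingerprint]"
    method => "SHA256"
    key => "any-static-key"
  }
}

output {
  elasticsearch {
    hosts => ["xxxxxxx:9200"]
    index => "xxxxxxx-%{+YYYY.MM.dd}"
    # reuse the fingerprint as the document id so redeliveries overwrite instead of duplicating
    document_id => "%{[@metadata][fingerprint]}"
  }
}

Would something like this be the right direction, or is there a way to stop the Kafka input from redelivering events in the first place?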
Can someone help me?
Thanks