We are using Logstash with a Kafka input and an Elasticsearch output, and we noticed strange behavior when ES is slow to respond (for example, when we push many document updates).
- Kafka is constantly rebalancing the consumer group (which consists of 10 Logstash instances, each with a different client_id but all sharing the same group_id)
- None of the logstash instances are committing their consumer offsets to Kafka
This leads to Logstash constantly replaying the same events from Kafka without making any progress, and to lag that keeps growing indefinitely (because the consumer offsets are never committed).
When I change the output to stdout (for example) without touching the Kafka input configuration, the issue goes away, so it is definitely related to the slowness on the ES side - but I don't understand why Logstash isn't committing any offsets to Kafka at all, regardless of whether ES is fast or slow.
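For reference, the variant that behaves correctly only swaps the output block for stdout; a sketch of that swap is below (the metadata flag is just something I would add to print the [@metadata][kafka] fields that decorate_events attaches, so the replayed offsets are visible - it is not part of what we actually ran):

output {
  # Same pipeline with the elasticsearch output swapped for stdout - with this
  # the consumer group stays stable and offsets are committed normally.
  # metadata => true is optional; it prints the [@metadata][kafka] fields
  # (topic, partition, offset) added by decorate_events, which makes any
  # replayed offsets easy to spot.
  stdout {
    codec => rubydebug { metadata => true }
  }
}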
This is what my configuration currently looks like - has anyone encountered similar behaviour before?
input {
  kafka {
    codec => "json"
    topics => ["Metrics"]
    group_id => "metric_group"
    poll_timeout_ms => "1000"
    enable_auto_commit => true
    auto_commit_interval_ms => "100"
    heartbeat_interval_ms => "1000"
    decorate_events => true
    bootstrap_servers => "kafka-pre-001.local:9092,kafka-pre-002.local:9092,kafka-pre-003.local:9092"
  }
}

filter {
}

output {
  #stdout { codec => rubydebug }
  elasticsearch {
    hosts => [
      "es-pre-001.local:9092",
      "es-pre-002.local:9092"
    ]
    document_type => "doc"
    document_id => "%{doc_id}"
    index => "%{doc_target}-%{+YYYY.MM}"
    doc_as_upsert => true
    action => "update"
    retry_max_interval => 5
    retry_on_conflict => 5
    timeout => 1000000
  }
}
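Is the right direction here to give the consumer more headroom per poll cycle, so that a slow ES flush doesn't trip a rebalance? Something along these lines is what I have in mind - the values below are purely illustrative (not what we currently run), and max_poll_interval_ms is only available on newer versions of the kafka input plugin:

input {
  kafka {
    # ...same settings as above, plus illustrative consumer tuning...
    max_poll_records => "50"           # fewer records per poll, so each poll cycle finishes sooner
    session_timeout_ms => "30000"      # more slack before the broker drops the consumer from the group
    max_poll_interval_ms => "300000"   # upper bound on time between polls (newer plugin versions only)
  }
}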