Hello,
I'm wondering how to make output to Kafka reliable. I'm new to Kafka, so please forgive any logical errors.
I'm using Logstash 5.4.0 and Kafka 0.10.2.1 with a matching ZooKeeper. The Kafka setup is simple: a single instance with a single broker.
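Concretely, I bring Kafka up roughly like this (a sketch, assuming the stock scripts and the near-default config files shipped with the Kafka 0.10.2.1 distribution):

# single ZooKeeper instance plus a single broker, default configs
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties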
The basic idea is to parse events from pseudo-log files on the file system and send them to Kafka reliably.
I thought about network issues, Kafka being down, etc., so I added retries and related settings to my Kafka output plugin config. The basic test scenario, using stdin/stdout, was:
- fire up both Kafka and Logstash
- Logstash emits events based on stdin input
- read events from the Kafka topic with a simple stdout consumer (sketched after this list)
- stop Kafka
- keep emitting events in Logstash - the output should keep retrying them, right?
- start Kafka again
- read events from the start of the Kafka topic - the retried events should be present, right?
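For the reading steps, I use a console consumer along these lines (a sketch, assuming the stock Kafka tooling and the topic/broker names from my pipeline config below):

# simple stdout consumer; --from-beginning replays the whole topic,
# so retried events should show up here after Kafka comes back
bin/kafka-console-consumer.sh \
  --bootstrap-server kafka:9092 \
  --topic test \
  --from-beginning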
However, after I start Kafka again, no new events show up. So it seems to me that the retry policy doesn't take effect.
Pipeline setup:
input { stdin { } }

output {
  kafka {
    codec => plain {
      format => "%{message}"
    }
    topic_id => "test"
    client_id => "test_console_client"
    bootstrap_servers => "kafka:9092"
    acks => "all"                 # wait for acknowledgement from all in-sync replicas
    retries => 999                # let the producer retry failed sends many times
    retry_backoff_ms => 10000     # back off 10 s between retries
    block_on_buffer_full => true  # block instead of dropping when the send buffer is full
  }
}
I've also examined the pipeline's internals through the _node/stats/pipeline API. It reported an equal number of "in" and "out" events for the Kafka output plugin even while Kafka was down.
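For reference, I queried the stats roughly like this (assuming Logstash's monitoring API on its default port 9600):

# node stats for the pipeline; the kafka output's "in"/"out" event
# counters appear under the outputs section of the response
curl -s 'http://localhost:9600/_node/stats/pipeline?pretty'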
Is my setup wrong? Is the testing idea flawed?
Thanks in advance!