Reliable Kafka Output Plugin config


#1

Hello,
I'm wondering how to make my output to Kafka reliable. I'm new to Kafka, so please forgive any logical errors.

I use Logstash 5.4.0 and Kafka 0.10.2.1 with a matching ZooKeeper. The Kafka setup is rather simple: a single instance with a single broker.
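For completeness, this is roughly how I bring everything up, using the stock scripts shipped with the Kafka distribution (the pipeline config file name kafka-test.conf is just what I happen to call it locally):

 # single-node ZooKeeper and Kafka broker, each in its own terminal
 bin/zookeeper-server-start.sh config/zookeeper.properties
 bin/kafka-server-start.sh config/server.properties

 # Logstash with the pipeline config shown further below
 bin/logstash -f kafka-test.conf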

My basic idea is to parse events from pseudo-log files on the file system and send them to Kafka in a reliable manner.

I thought about network issues, Kafka being down, and so on, so I added retries to my Kafka output plugin config. The basic test scenario, using stdin/stdout, was:

  • fire up both Kafka and Logstash
  • Logstash emits events based on stdin input
  • read events from a Kafka topic with a simple stdout consumer
  • stop Kafka
  • keep emitting events in Logstash - the output should keep retrying them, right?
  • start Kafka again
  • read events from the start of the Kafka topic - the retried events should be present, right? (the exact commands are sketched below)
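The consumer and the stop/start steps look roughly like this, assuming the stock Kafka scripts, the topic test and the broker address kafka:9092 from the pipeline config below:

 # simple stdout consumer, reading the topic from the beginning
 bin/kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic test --from-beginning

 # stop the broker while Logstash keeps emitting events from stdin
 bin/kafka-server-stop.sh

 # later: bring the broker back and read the topic from the beginning again
 bin/kafka-server-start.sh config/server.properties
 bin/kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic test --from-beginning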

After I start Kafka again, no new events show up. So it seems to me that the retry policy doesn't take effect.

Pipeline setup:

 input { stdin { } }
 output {
   kafka {
     codec => plain {
       format => "%{message}"
     }
     topic_id => "test"
     client_id => "test_console_client"
     bootstrap_servers => "kafka:9092"
     acks => "all"                    # wait for all in-sync replicas to acknowledge each record
     retries => 999                   # retry failed sends (almost) indefinitely
     retry_backoff_ms => 10000        # wait 10 s before retrying a failed request
     block_on_buffer_full => true     # block instead of erroring when the local buffer is full
   }
 }

I've tried to examine the pipeline's internals through the _node/stats/pipeline API. It reports an equal number of "in" and "out" events for the kafka output even while Kafka is down.
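For reference, I query the stats like this (assuming the Logstash monitoring API is listening on its default port 9600 on the same host):

 curl -s 'http://localhost:9600/_node/stats/pipeline?pretty'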

Is my setup wrong? Is the testing idea flawed?

Thanks in advance!


#2

The cause is the Kafka producer setting max.block.ms, which defaults to 60 000 ms. While the broker is unreachable, a send blocks for at most that long waiting for topic metadata (or buffer space); once the timeout expires the send fails and the message is dropped, so the record never makes it into the producer's buffer and the retries setting doesn't come into play.


(system) #3

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.