Logstash not pulling data fast enough from Kafka

I have a huge problem. My Kafka cluster is in a different DC and we are using Logstash to pull data. Our Elastic Stack runs in Kubernetes, but the data is being pulled very slowly. How can I optimize Logstash to pull data faster?

How do you know the bottleneck is Logstash? How many Logstash consumers do you have reading from the Kafka topic? How many partitions does your Kafka topic have? What does your Kafka input configuration look like on the Logstash side? What's the spec of the Logstash consumer?


Thanks for the help! Let me answer those questions here:
How do you know the bottleneck is Logstash?
I have installed Redpanda on my cloud and I can see the lag on my Kafka broker. I see a huge lag quite often, and it takes days to clear.

How many Logstash consumers do you have reading from the Kafka topic?
I have 12 partitions, so I have 12 Logstash kafka inputs configured to pull from Kafka as consumers.
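
(Note: the total consumer_threads across all inputs should not exceed the partition count, since Kafka leaves any consumers beyond the partition count idle. A minimal sketch of one input covering all 12 partitions, reusing the names from my config below:)

input {
      kafka {
        bootstrap_servers => "my-server-kafka"
        topics => ["logstash"]
        group_id => "logstash"
        # 12 partitions -> 12 consumer threads, one per partition
        consumer_threads => 12
      }
}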

What does your Kafka input configuration look like on the Logstash side?
The producer is a Logstash output, but I do see the messages in Redpanda almost in real time, so that part is good.

What's the spec of the Logstash consumer?

input {
      kafka {
        bootstrap_servers => "my-server-kafka"
        topics => ["logstash"]
        codec => "json"
        group_id => "logstash"
        auto_offset_reset => "latest"
        session_timeout_ms => "250000"
        request_timeout_ms => "300000"
        security_protocol => "SSL"
        ssl_endpoint_identification_algorithm => ""
        ssl_keystore_location => "/usr/share/logstash/keystore/keystore"
        ssl_key_password => "Password"
        ssl_keystore_password => "Password"
        ssl_truststore_location => "/usr/share/logstash/keystore/truststore"
        ssl_truststore_password => "Password"
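        # Fetch tuning below: the broker holds each fetch until
        # fetch_min_bytes (~6 MB here) is available or fetch_max_wait_ms
        # elapses, whichever comes first; with a 500 ms cap, a slow
        # cross-DC link tends to get many small responses.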
        fetch_max_wait_ms => "500"
        fetch_max_bytes => "96582912"
        fetch_min_bytes => "6048576"
        max_partition_fetch_bytes => "8048576"
        consumer_threads => "12"
        max_poll_records => "2000"
      }
}

I fixed this, but in case anyone has problems with the Logstash kafka input as a Kafka consumer, this is a great article:

and this is my config now; the only change is fetch_max_wait_ms, raised from 500 to 3000:

input {
      kafka {
        bootstrap_servers => "my-server-kafka"
        topics => ["logstash"]
        codec => "json"
        group_id => "logstash"
        auto_offset_reset => "latest"
        session_timeout_ms => "250000"
        request_timeout_ms => "300000"
        security_protocol => "SSL"
        ssl_endpoint_identification_algorithm => ""
        ssl_keystore_location => "/usr/share/logstash/keystore/keystore"
        ssl_key_password => "Password"
        ssl_keystore_password => "Password"
        ssl_truststore_location => "/usr/share/logstash/keystore/truststore"
        ssl_truststore_password => "Password"
        fetch_max_wait_ms => 3000
        fetch_max_bytes => "96582912"
        fetch_min_bytes => "6048576"
        max_partition_fetch_bytes => "8048576"
        consumer_threads => "12"
        max_poll_records => "2000"
      }
}
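
The effect, as I understand it: the broker now has up to 3 seconds to accumulate fetch_min_bytes before answering each fetch, so every round trip across the DC link carries a much larger batch instead of whatever happened to arrive within 500 ms.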
