Understand "reconnect" and "retries" settings in Kafka output

Hello all,

With this config:

input {
  tcp {
    type => "NETWORK_DEVICE"
    port => 1602
  }
}

filter {
}

output {
  kafka {
    topic_id => "network.device"
    codec => json
    bootstrap_servers => "kafka1:9092,kafka2:9092,kafka3:9092"
    ssl_truststore_location => "/etc/logstash/kafka.server.truststore.jks"
    ssl_truststore_password => "itdepends"
    ssl_truststore_type => "JKS"
    security_protocol => "SSL"
  }
}

I noticed that if the Kafka nodes are down, Logstash shows this message every x ms:

[...]
[2021-05-05T04:33:36,893][WARN ][org.apache.kafka.clients.NetworkClient][network_device] [Producer clientId=producer-1] Connection to node 2 (kafka2/192.168.2.230:9092) could not be established. Broker may not be available.
[2021-05-05T04:33:36,919][WARN ][org.apache.kafka.clients.NetworkClient][network_device] [Producer clientId=producer-1] Connection to node 3 (kafka3/192.168.2.248:9092) could not be established. Broker may not be available.
[2021-05-05T04:33:36,945][WARN ][org.apache.kafka.clients.NetworkClient][network_device] [Producer clientId=producer-1] Connection to node 2 (kafka2/192.168.2.230:9092) could not be established. Broker may not be available.
[2021-05-05T04:33:36,997][WARN ][org.apache.kafka.clients.NetworkClient][network_device] [Producer clientId=producer-1] Connection to node 2 (kafka2/192.168.2.230:9092) could not be established. Broker may not be available.
[2021-05-05T04:33:37,021][WARN ][org.apache.kafka.clients.NetworkClient][network_device] [Producer clientId=producer-1] Connection to node 3 (kafka3/192.168.2.248:9092) could not be established. Broker may not be available.
[2021-05-05T04:33:37,048][WARN ][org.apache.kafka.clients.NetworkClient][network_device] [Producer clientId=producer-1] Connection to node 2 (kafka2/192.168.2.230:9092) could not be established. Broker may not be available.
[2021-05-05T04:33:37,099][WARN ][org.apache.kafka.clients.NetworkClient][network_device] [Producer clientId=producer-1] Connection to node 2 (kafka2/192.168.2.230:9092) could not be established. Broker may not be available.
[2021-05-05T04:33:37,122][WARN ][org.apache.kafka.clients.NetworkClient][network_device] [Producer clientId=producer-1] Connection to node 3 (kafka3/192.168.2.248:9092) could not be established. Broker may not be available.
[2021-05-05T04:33:37,150][WARN ][org.apache.kafka.clients.NetworkClient][network_device] [Producer clientId=producer-1] Connection to node 2 (kafka2/192.168.2.230:9092) could not be established. Broker may not be available.
[2021-05-05T04:33:37,202][WARN ][org.apache.kafka.clients.NetworkClient][network_device] [Producer clientId=producer-1] Connection to node 2 (kafka2/192.168.2.230:9092) could not be established. Broker may not be available.
[...]

When looking at the documentation, I see three settings:

  • reconnect_backoff_ms
  • retries
  • retry_backoff_ms

I would like to understand these settings:

"reconnect_backoff_ms": is this setting related to the error messages above? Does it mean "do not try to reconnect to Kafka for x amount of time"?

"retries": which retries are we talking about?

"retry_backoff_ms": what is the difference with "retries"?

Thanks for your help ! :slight_smile:

Hello Travis,

Those settings map to Kafka settings: Producer Configurations | Confluent Documentation

Maybe this description of the reconnect.backoff.max.ms parameter helps you understand the backoff logic:

...the backoff per host will increase exponentially for each consecutive connection failure...

If you set the backoff to 1000 ms, the first retry would occur after 1 second, the second after 2 seconds, the third after 4 seconds, and so on.
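The doubling schedule described above can be sketched in a few lines of Python (illustrative only; the function name and parameters are mine, and the real Kafka client additionally applies up to 20% random jitter to each wait):

```python
def reconnect_backoff_schedule(base_ms, max_ms, failures):
    """Backoff before each consecutive reconnect attempt to the same host:
    start at base_ms, double on every consecutive failure, cap at max_ms.
    (The actual Kafka client also adds up to 20% random jitter.)"""
    return [min(base_ms * 2 ** i, max_ms) for i in range(failures)]

# e.g. reconnect.backoff.ms=1000, reconnect.backoff.max.ms=8000:
print(reconnect_backoff_schedule(1000, 8000, 5))
# → [1000, 2000, 4000, 8000, 8000]
```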

This is also true for the retry backoff. While the backoff parameter defines the time between retries, the retries parameter defines the maximum number of retries.

Retries in Kafka are described as follows:

Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. Note that this retry is no different than if the client resent the record upon receiving the error.
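For concreteness, all three settings go directly into the kafka output block. A sketch, with example values only (not recommendations; for reference, Kafka's default reconnect.backoff.ms is 50 ms):

    output {
      kafka {
        topic_id => "network.device"
        bootstrap_servers => "kafka1:9092,kafka2:9092,kafka3:9092"
        reconnect_backoff_ms => 1000   # wait before re-attempting a broker connection
        retries => 3                   # max number of times a failed send is retried
        retry_backoff_ms => 500        # wait before retrying a failed send
      }
    }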

Best regards
Wolfram

Hello Wolfram,

Thanks for this feedback.

I'm not sure I understand because, when you look at the logs above, reconnect attempts happen every ~50 ms and never increase exponentially.

If we take only the kafka2 lines, the timestamps are 04:33:36,893 / 36,945 / 36,997 / 37,048 / 37,099 / 37,150 / 37,202 — roughly one attempt every 50 ms.

I am talking about the reconnect_backoff_ms setting: Kafka output plugin | Logstash Reference [7.12] | Elastic

Or is the Logstash setting just static rather than exponential?

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.