Logstash TCP Output Plugin Reconnect Interval Issue

Hi,

I'm having a problem with the Logstash tcp output plugin: reconnect_interval does not seem to have any effect on the plugin's actual reconnect attempts. I may be misunderstanding what the field is for, but no matter what I set it to, I see the following pattern of reconnect attempts when the TCP server I'm trying to connect to is down:

Failed (Connection refused - connect(2) for "10.0.20.158" port 13370). Sleeping for 0.02
Failed (Connection refused - connect(2) for "10.0.20.158" port 13370). Sleeping for 0.04
Failed (Connection refused - connect(2) for "10.0.20.158" port 13370). Sleeping for 0.08
Failed (Connection refused - connect(2) for "10.0.20.158" port 13370). Sleeping for 0.16
Failed (Connection refused - connect(2) for "10.0.20.158" port 13370). Sleeping for 0.32
Failed (Connection refused - connect(2) for "10.0.20.158" port 13370). Sleeping for 0.64
Failed (Connection refused - connect(2) for "10.0.20.158" port 13370). Sleeping for 1.28
Failed (Connection refused - connect(2) for "10.0.20.158" port 13370). Sleeping for 2.0
Failed (Connection refused - connect(2) for "10.0.20.158" port 13370). Sleeping for 2.0
Failed (Connection refused - connect(2) for "10.0.20.158" port 13370). Sleeping for 2.0

Here is an example of the output plugin's config:

    output {
      stdout { codec => rubydebug }
      tcp {
        host => "${TCP_OUTPUT_HOST}"
        port => "${TCP_OUTPUT_PORT}"
        reconnect_interval => "15"
        mode => "client"
        codec => "plain"
      }
    }

That is confusing! What you are seeing is the retry handling from the stud library inside the plugin's connect method.

The reconnect_interval sleep only happens once connect throws an exception. The plugin then sleeps for reconnect_interval, goes back into connect, and does the retries with exponential backoff again.

I would expect the backoff to go down to 0.02 again, not stay at 2.0 :thinking:
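
To make the pattern in the log above concrete, here is a minimal Python sketch of a doubling backoff with a cap. It is not the plugin's or stud's actual code: the function names and the start value (0.02) and cap (2.0) are assumptions read off the "Sleeping for" values in the log, and nothing here touches a real socket.

```python
from itertools import islice


def backoff_delays(start=0.02, cap=2.0):
    """Yield sleep durations that double each time until they reach the cap.

    The start and cap values are assumptions taken from the log output,
    not values read from the plugin source.
    """
    delay = start
    while True:
        yield delay
        # Double the delay, but never exceed the cap.
        delay = min(delay * 2, cap)


def first_delays(n, start=0.02, cap=2.0):
    """Return the first n sleep durations of the schedule as a list."""
    return list(islice(backoff_delays(start, cap), n))


if __name__ == "__main__":
    # Ten attempts, matching the ten log lines in the question.
    print(first_delays(10))
```

Under those assumptions the first ten delays match the log: seven doubling steps from 0.02 up to 1.28, then 2.0 repeated. A retry loop of this shape that never gives up also never raises, which would explain why the reconnect_interval sleep is never reached.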


Thanks for your reply! :+1: Well spotted. That's interesting: I've left it running for long periods and it seems to stay at 2.0 s indefinitely, and it doesn't seem to throw an exception at all. I'm not sure whether that is expected behaviour here.

Sorry to keep harping on about this, but I'm wondering whether this would be considered a bug, or at least a documentation error. The plugin docs imply that if a connection fails, the plugin will retry based on that field, but in my case, which I'd imagine is fairly common, reconnect_interval is seemingly never taken into account.

Looks like there is an open issue in the TCP Output Plugin repo on this already:
