Failed to connect to backoff - Filebeat and ELK in different environments with NAT


(Daniela) #1

Hi,
I have two different environments: "environment-a" runs the ELK stack in Docker containers, and "environment-b" runs a dockerised application that should send its logs using Filebeat.
Both environments use NAT.

This is my filebeat config:

filebeat.inputs:
  - type: log
    paths:
      - /var/tmp/app_*.log

output.logstash:
  hosts: ["IP_ENV_A:5044"]
  ssl.verification_mode: none
  bulk_max_size: 1024

This is the logstash config:
input {
  beats {
    port => 5044
  }
}
output {
  stdout {
    codec => rubydebug
  }
}

This is the error that I get in the Filebeat logs (the port "48538" changes every time the ERROR is generated):

ERROR pipeline/output.go:100 Failed to connect to backoff(async(tcp://IP_ENV_A:5044)): read tcp 172.x.x.x:48538->IP_ENV_A:5044: read: connection reset by peer

I can telnet IP_ENV_A 5044.

What's wrong?


(Noémi Ványi) #2

Do you see any errors in the logs of the output?
Are you sure that Logstash is reachable from the container of Filebeat?


(Daniela) #3

The error that I see in the Filebeat logs is the one that I have already posted. Here it is again:
ERROR pipeline/output.go:100 Failed to connect to backoff(async(tcp://IP_ENV_A:5044)): read tcp 172.x.x.x:48538->IP_ENV_A:5044: read: connection reset by peer

How can I be sure that the Logstash container is reachable from the Filebeat container?

I tried running a simple TCP server in the Logstash container and a simple TCP client in the Filebeat container, and the communication works fine!
Telnet works fine too. So I still don't understand what the problem is.
Help please!


(Noémi Ványi) #4

Sorry if I was not clear. I meant: do you see any errors in the logs of Logstash?


(Daniela) #5

The directory /var/log/logstash is empty. So I don't have logs... Any help to understand the problem?

In Filebeat, I sometimes get this error:
2018-11-12T11:44:43.986Z ERROR pipeline/output.go:100 Failed to connect to backoff(async(tcp://IP_ENV_A:5088)): EOF


(Steffen Siering) #6

The error messages indicate that the connection is being closed, not by Filebeat, but by Logstash, a firewall, or a NAT device (e.g. due to a timeout).

read: connection reset by peer

This one indicates that Filebeat did publish events, but the connection was closed while Filebeat was waiting for the ACK from Logstash. This normally happens if Logstash is killed abnormally or if an intermediate NAT device or firewall closes the connection due to an internal timeout. Normally Logstash sends a keep-alive signal to Filebeat while events are being processed, and Filebeat has an internal timeout in case Logstash never responds or never sends the keep-alive.
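
If an idle timeout on a NAT device or firewall turns out to be the cause, the Filebeat settings that seem most relevant are `timeout` (how long Filebeat waits for a response from Logstash) and `ttl` (how long a connection is kept before Filebeat re-establishes it). A minimal sketch follows. The values are illustrative assumptions, not taken from this setup, and would need to be chosen below whatever idle timeout the NAT device actually enforces:

```yaml
output.logstash:
  hosts: ["IP_ENV_A:5044"]
  # Seconds to wait for a response from Logstash before the
  # request is treated as failed (illustrative value).
  timeout: 30
  # Re-establish the connection after this long, so that it is
  # not held open past the NAT idle timeout (illustrative value).
  ttl: 60s
  # The Filebeat docs describe ttl as unsupported on the async
  # (pipelined) client, so pipelining is disabled here.
  pipelining: 0
```

Disabling pipelining trades some throughput for the ability to use `ttl`, so this is worth trying only after confirming that the resets line up with idle periods.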