Seeing this on a machine that runs short-lived containers every few minutes. Filebeat seems to run out of file descriptors because it hangs onto connections to the Docker daemon socket. File descriptor leak?
It takes a day or two to get to that point; the open file handle count increases steadily the whole time.
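For context, the workload is roughly equivalent to the loop below (a hypothetical repro sketch, not our actual scheduler; the alpine image and the 60-second interval are made up). Each container start/stop is an event the docker autodiscover provider has to react to:

# Hypothetical repro: launch a short-lived container every minute so the
# docker autodiscover provider has to create and tear down a runner each time.
while true; do
  docker run --rm alpine:3 sh -c 'echo "short-lived workload"'
  sleep 60
done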
root@system:/var/log/filebeat# lsof -n 2>&1 | grep -i filebeat | grep "type=STREAM" | wc -l
73512
root@system:/var/log/filebeat# lsof -n 2>&1 | grep -i filebeat | grep "type=STREAM" | tail -n 5
filebeat 949 32596 root 1019u unix 0xffff8817d69eb400 0t0 52295907 type=STREAM
filebeat 949 32596 root 1020u unix 0xffff88180019f800 0t0 52291899 type=STREAM
filebeat 949 32596 root 1021u unix 0xffff880bfbb6f000 0t0 52291020 type=STREAM
filebeat 949 32596 root 1022u unix 0xffff880b64fd4800 0t0 52265676 type=STREAM
filebeat 949 32596 root 1023u unix 0xffff880b51642800 0t0 52287359 type=STREAM
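(The 73512 figure is presumably inflated by lsof listing one row per thread; the fd numbers themselves only go up to 1023. The per-process count via /proc tells the same story and is easy to watch over time; a minimal sketch, assuming a single process named exactly "filebeat":)

# Log filebeat's per-process open-fd count every 5 minutes.
# pgrep -o -x picks the oldest process whose name is exactly "filebeat".
while true; do
  echo "$(date -Is) $(ls /proc/"$(pgrep -o -x filebeat)"/fd | wc -l)"
  sleep 300
done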
It gets to the point where filebeat can't even rotate out its own logfile at /var/log/filebeat/filebeat.
...
Latest log entries:
root@system:/var/log/filebeat# tail -n 10 filebeat.1
2019-10-28T07:37:05.434-0400 ERROR [autodiscover] cfgfile/list.go:96 Error creating runner from config: error during connect: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.22/containers/json?limit=0: dial unix /var/run/docker.sock: socket: too many open files
2019-10-28T07:37:05.434-0400 ERROR [autodiscover] cfgfile/list.go:96 Error creating runner from config: error during connect: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.22/containers/json?limit=0: dial unix /var/run/docker.sock: socket: too many open files
2019-10-28T07:37:05.435-0400 ERROR [autodiscover] cfgfile/list.go:96 Error creating runner from config: error during connect: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.22/containers/json?limit=0: dial unix /var/run/docker.sock: socket: too many open files
2019-10-28T07:37:05.435-0400 ERROR [autodiscover] cfgfile/list.go:96 Error creating runner from config: error during connect: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.22/containers/json?limit=0: dial unix /var/run/docker.sock: socket: too many open files
2019-10-28T07:37:05.436-0400 ERROR [autodiscover] cfgfile/list.go:96 Error creating runner from config: error during connect: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.22/containers/json?limit=0: dial unix /var/run/docker.sock: socket: too many open files
2019-10-28T07:37:05.436-0400 ERROR [autodiscover] cfgfile/list.go:96 Error creating runner from config: error during connect: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.22/containers/json?limit=0: dial unix /var/run/docker.sock: socket: too many open files
2019-10-28T07:37:05.437-0400 ERROR [autodiscover] cfgfile/list.go:96 Error creating runner from config: error during connect: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.22/containers/json?limit=0: dial unix /var/run/docker.sock: socket: too many open files
2019-10-28T07:37:05.437-0400 ERROR [autodiscover] cfgfile/list.go:96 Error creating runner from config: error during connect: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.22/containers/json?limit=0: dial unix /var/run/docker.sock: socket: too many open files
2019-10-28T07:37:05.437-0400 ERROR [autodiscover] cfgfile/list.go:96 Error creating runner from config: error during connect: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.22/containers/json?limit=0: dial unix /var/run/docker.sock: socket: too many open files
2019-10-28T07:37:05.438-0400 ERROR [autodiscover] cfgfile/list.go:96 Error creating runner from config: error during connect: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.22/containers/json?limit=0: dial unix /var/run/docker.sock: socket: too many open files
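Note that the fd numbers in the lsof output top out at 1023, which matches the default soft limit of 1024. The limit the process is actually running into can be confirmed with:

# Show the RLIMIT_NOFILE soft/hard values for the running filebeat process:
grep "Max open files" /proc/"$(pgrep -o -x filebeat)"/limits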
Tested on 6.8.3 and 7.4.1 so far. Config snippet from the 6.8.3 setup:
filebeat.autodiscover:
  providers:
    - type: docker
      cleanup_timeout: 30s
      templates:
        - condition:
            regexp:
              docker.container.name: ".*"
          config:
            - type: docker
              containers.ids:
                - "${data.docker.container.id}"
              processors:
                - add_docker_metadata: ~
                - decode_json_fields:
                    fields: ["message"]
                    target: "message_json"
.....
output:
  redis:
    hosts: ["ourredisserver"]
    port: 6379
    key: "logstash-somesuffix"
    datatype: "list"
    timeout: 5
    reconnect_interval: 1
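As a stopgap while the leak itself gets investigated, raising the unit's fd limit at least stretches the time-to-failure. A minimal sketch, assuming filebeat runs under systemd (the 65536 value is arbitrary):

# /etc/systemd/system/filebeat.service.d/override.conf
# (create with: systemctl edit filebeat, then systemctl daemon-reload
#  and systemctl restart filebeat)
[Service]
LimitNOFILE=65536

This only delays the exhaustion, of course; it doesn't stop the leak.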