Environment
- Metricbeat OSS: 7.10.2
- Elasticsearch OSS: 7.10.2
- Docker Engine: 18.09.0
- Docker API: 1.39 (minimum version 1.12)
- OS: CentOS 7.6.1810
Hi,
My deployment is on Linux with Docker: Elasticsearch runs as a container alongside other application-specific containers, and Metricbeat is installed directly on the host from the tar archive.
I am experiencing a file descriptor leak when an application container gets into a bad state and is stuck waiting for a 'docker stop' command to complete. While this is happening, each cycle of the metricbeat docker module leaks one file descriptor for each of the cpu, diskio, memory, and network metricsets (four in total per cycle). The FD count correlates with the single bad-state container and with those four metricsets being enabled; the other metricsets (container, info, and healthcheck) do not exhibit the leak.
When the leak is occurring, stopping metricbeat stops it. The leak also stops if the container is killed (via HUP) while metricbeat keeps running, and it only appears at all when metricbeat is part of the deployment. I see no warnings or errors in the metricbeat log while the leak is in progress; the only symptom is fewer events published to Elasticsearch (i.e., events from the bad-state container are missing).
At this point I cannot say for certain that this is a metricbeat issue. I have tried numerous combinations of versions; in fact, this was first spotted on version 6.7.1 of the Elastic stack. I have run out of ideas and would appreciate someone taking a look. For the time being we have disabled the docker module to avoid running out of FDs on the host.
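To put the leak rate in perspective (my own back-of-the-envelope arithmetic): with the docker module period at 10s, 4 FDs per cycle is 4 × 8,640 cycles/day = 34,560 FDs per day, so a single bad-state container would exhaust the 65,536 open-files limit shown further down in under two days.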
I've been able to simulate getting a generic container into this bad state and can reproduce the behavior consistently. To do so, I set up the following environment:
Elasticsearch setup:
docker run --name es -d -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.2-amd64
Test container setup:
docker run -d --name test --rm -it python:alpine3.13 python -c "while True: print(1);"
Metricbeat setup:
curl -L -O https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-oss-7.10.2-linux-x86_64.tar.gz
tar xzvf metricbeat-oss-7.10.2-linux-x86_64.tar.gz
Metricbeat configuration (placed in the extract path):
metricbeat.yml
---
name: metricbeat
metricbeat.config.modules:
  enabled: true
  path: ./module-*.yml
  reload.enabled: true
  reload.period: 30s
output.elasticsearch:
  enabled: true
  username: "elastic"
  password: "changeme"
  hosts: [ "http://localhost:9200" ]
logging.level: debug
logging.to_files: true
logging.files:
  path: ./
  name: metricbeat.log
  rotateeverybytes: 104857600
  interval: 24h
  keepfiles: 7
  permissions: 0644
logging.metrics:
  enabled: true
  period: 60s
setup.template.enabled: true
setup.template.settings:
  index.number_of_shards: 1
  index.auto_expand_replicas: "0-1"
module-docker.yml
---
- module: docker
  metricsets:
    #- container
    #- info
    - cpu
    - diskio
    - memory
    - network
    #- healthcheck
  cpu.cores: false
  enabled: true
  hosts: [ "unix:///var/run/docker.sock" ]
  period: 10s
Metricbeat start:
./metricbeat
After all of this is started up, I also run a script in an endless loop to check various file descriptor statistics, particularly the dockerd and docker.sock counts. Leaked sockets appear with a '* 0' peer in the output of the ss command on CentOS:
u_str ESTAB 0 0 /var/run/docker.sock 4832522 * 0 users:(("dockerd",pid=284967,fd=36))
check_fd.sh
#!/bin/bash
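# Poll dockerd and metricbeat file descriptor statistics in an endless loop,
# printing each count and its delta from the previous iteration.
# Run as root so /proc/<pid>/fd and lsof are readable.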
PRIOR_ABANDON_SOCKET_CNT=0
PRIOR_DOC_PROC_FD_CNT=0
PRIOR_DOC_LSOF_FD_CNT=0
PRIOR_DOC_LSOF_SOCKET_CNT=0
PRIOR_DOC_GO_ROUTINE_CNT=0
PRIOR_BEAT_PROC_FD_CNT=0
PRIOR_BEAT_LSOF_FD_CNT=0
while true; do
  echo ""
  echo "################################################################"
  echo ""
  echo "$(ss -a -p -e | grep docker.sock | grep -v LISTEN)"
  echo ""
  CURRENT_ABANDON_SOCKET_CNT=$(ss -a -p -e | grep docker.sock | grep -v LISTEN | grep '* 0' | wc -l)
  echo "Abandoned socket count=$CURRENT_ABANDON_SOCKET_CNT diff=$(expr $CURRENT_ABANDON_SOCKET_CNT - $PRIOR_ABANDON_SOCKET_CNT)"
  PRIOR_ABANDON_SOCKET_CNT=$CURRENT_ABANDON_SOCKET_CNT
  echo ""
  echo "################################################################"
  echo ""
  # dockerd FD counts, from /proc and from lsof
  CURRENT_DOC_PROC_FD_CNT=$(ls -la /proc/$(pidof dockerd)/fd | wc -l)
  echo "dockerd proc fd count=$CURRENT_DOC_PROC_FD_CNT diff=$(expr $CURRENT_DOC_PROC_FD_CNT - $PRIOR_DOC_PROC_FD_CNT)"
  PRIOR_DOC_PROC_FD_CNT=$CURRENT_DOC_PROC_FD_CNT
  CURRENT_DOC_LSOF_FD_CNT=$(lsof -p $(pidof dockerd) | wc -l)
  echo "dockerd lsof fd count=$CURRENT_DOC_LSOF_FD_CNT diff=$(expr $CURRENT_DOC_LSOF_FD_CNT - $PRIOR_DOC_LSOF_FD_CNT)"
  PRIOR_DOC_LSOF_FD_CNT=$CURRENT_DOC_LSOF_FD_CNT
  CURRENT_DOC_LSOF_SOCKET_CNT=$(lsof -p $(pidof dockerd) | grep '/var/run/docker.sock' | wc -l)
  echo "dockerd lsof docker.sock only count=$CURRENT_DOC_LSOF_SOCKET_CNT diff=$(expr $CURRENT_DOC_LSOF_SOCKET_CNT - $PRIOR_DOC_LSOF_SOCKET_CNT)"
  PRIOR_DOC_LSOF_SOCKET_CNT=$CURRENT_DOC_LSOF_SOCKET_CNT
  # dockerd goroutine count via its pprof debug endpoint
  CURRENT_DOC_GO_ROUTINE_CNT=$(curl -s --unix-socket /var/run/docker.sock http://./debug/pprof/goroutine\?debug\=2 | grep goroutine | wc -l)
  echo "dockerd go routine count=$CURRENT_DOC_GO_ROUTINE_CNT diff=$(expr $CURRENT_DOC_GO_ROUTINE_CNT - $PRIOR_DOC_GO_ROUTINE_CNT)"
  PRIOR_DOC_GO_ROUTINE_CNT=$CURRENT_DOC_GO_ROUTINE_CNT
  echo ""
  if pgrep -x "metricbeat" > /dev/null; then
    CURRENT_BEAT_PROC_FD_CNT=$(ls -la /proc/$(pidof metricbeat)/fd | wc -l)
    echo "metricbeat proc fd count=$CURRENT_BEAT_PROC_FD_CNT diff=$(expr $CURRENT_BEAT_PROC_FD_CNT - $PRIOR_BEAT_PROC_FD_CNT)"
    PRIOR_BEAT_PROC_FD_CNT=$CURRENT_BEAT_PROC_FD_CNT
    CURRENT_BEAT_LSOF_FD_CNT=$(lsof -p $(pidof metricbeat) | wc -l)
    echo "metricbeat lsof fd count=$CURRENT_BEAT_LSOF_FD_CNT diff=$(expr $CURRENT_BEAT_LSOF_FD_CNT - $PRIOR_BEAT_LSOF_FD_CNT)"
    PRIOR_BEAT_LSOF_FD_CNT=$CURRENT_BEAT_LSOF_FD_CNT
  else
    echo "metricbeat is not running"
  fi
  echo ""
  sleep 5
done
Set test container to bad state:
To get the container into a bad state, I issue 100 docker exec API calls; each one starts a long-running ping from a shell inside the container and then detaches, leaving the process running.
# for i in {1..100}; do ./ping_detach.sh ; done
ping_detach.sh
#!/bin/sh
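# Create an exec instance in the test container through the Docker API
# socket, then start it detached so the ping keeps running after we return.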
CONTAINER_NAME=test
RAW_ID=$(curl --silent --unix-socket /var/run/docker.sock "http://./containers/${CONTAINER_NAME}/exec" \
-H "Content-Type: application/json" \
-d '{
"AttachStdout": true,
"Tty": true,
"Cmd": [ "sh", "-c", "ping -i 0.1 -c 100000 -s 1024 <real IP address>"]
}'
)
ID=$(echo "$RAW_ID" | python -c 'import json,sys; obj=json.load(sys.stdin); print(obj["Id"])')
curl --silent --unix-socket /var/run/docker.sock "http://./exec/${ID}/start" \
-H "Content-Type: application/json" \
-d '{
"Detach": true,
"Tty": true
}'
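For reference, each iteration should be roughly equivalent to this single docker CLI command (my own untested shorthand for the two API calls above):
# docker exec -d -t test sh -c "ping -i 0.1 -c 100000 -s 1024 <real IP address>"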
After the exec loop has completed, I attempt to stop the test container with the docker command:
# docker stop test
At this point the container does not stop immediately and the command appears to hang. Meanwhile, the leaked FDs start to show up and keep accumulating until I either stop metricbeat or send a HUP signal to the container's moby shim process.
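The HUP can be sent to the shim roughly like this (my own shorthand, assuming Docker 18.09's containerd-shim process and that the shim's command line contains the container's full ID):
# kill -HUP $(pgrep -f "containerd-shim.*$(docker inspect -f '{{.Id}}' test)")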
Additional info:
A reasonable limit is set on file descriptors:
# ulimit -aHS
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1030582
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 65536
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 257666
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
In addition to running metricbeat from the setup above, I also spent some time running it under delve in debug mode and attaching from IntelliJ via a remote Go session:
go build -i -gcflags 'all=-N -l'
dlv --listen=:5005 --headless=true --api-version=2 --accept-multiclient exec ./metricbeat
An error is caught in file metricbeat/docker/docker.go at line 150 (function exportContainerStats) when the container ID matches the bad-state container. The err variable holds this information:
err = error | github.com/docker/docker/errdefs.errUnknown
error = github.com/docker/docker/client.errConnectionFailed
host = unix:///var/run/docker.sock
while the containerStats variable holds:
Body = nil
OSType = ""
The only other container, Elasticsearch, continues to return successful stats from the same function during this time.
I am not familiar enough with the metricbeat code or the docker library it calls to know whether the underlying socket connection is being closed properly, or whether a container in this state should not be probed for stats at all. I believe the lower-level call to the docker client's ContainerStats is actually timing out with no response after approximately 5 seconds.
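To illustrate what I mean, here is a minimal sketch (my own code, assuming the github.com/docker/docker/client Go API; this is not metricbeat's implementation) of a defensive pattern around ContainerStats that closes the response body on every path and guards against the nil Body seen in the debugger:
package main

import (
    "context"
    "encoding/json"
    "fmt"

    "github.com/docker/docker/api/types"
    "github.com/docker/docker/client"
)

// containerStats closes resp.Body on every path, including the error path,
// and guards against the nil Body observed on the bad-state container.
func containerStats(ctx context.Context, cli *client.Client, id string) (*types.StatsJSON, error) {
    resp, err := cli.ContainerStats(ctx, id, false)
    if resp.Body != nil {
        // Release the underlying unix-socket connection even when err != nil.
        defer resp.Body.Close()
    }
    if err != nil {
        return nil, fmt.Errorf("container stats for %s: %v", id, err)
    }
    var stats types.StatsJSON
    if err := json.NewDecoder(resp.Body).Decode(&stats); err != nil {
        return nil, err
    }
    return &stats, nil
}

func main() {
    cli, err := client.NewClientWithOpts(client.FromEnv)
    if err != nil {
        panic(err)
    }
    stats, err := containerStats(context.Background(), cli, "test")
    fmt.Println(stats, err)
}
If the error path in the real code returns before the body is closed, that would match the one-abandoned-socket-per-metricset-per-cycle pattern I am seeing, but I cannot confirm this from the outside.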
Any help is appreciated.
Thanks,
james