Environment
- Metricbeat OSS: 7.10.2
- Elasticsearch OSS: 7.10.2
- Docker Engine: 18.09.0
- Docker API: 1.39 (minimum version 1.12)
- OS: CentOS 7.6.1810
Hi,
My deployment is on Linux with Docker: Elasticsearch runs as a container alongside other application-specific containers, and Metricbeat is installed directly on the host from the tar archive.
I am experiencing a file descriptor leak when an application container gets into a bad state and is stuck waiting for a 'docker stop' command to complete. While this is happening, each cycle of the metricbeat docker module leaks one file descriptor for each of the cpu, diskio, memory, and network metricsets (four in total per cycle). The FD count correlates with the single bad-state container and with those four metricsets being enabled; the other metricsets (container, info, and healthcheck) do not exhibit the leak.
When the leak is occurring, stopping metricbeat stops it. The leak also stops if the container is killed (via HUP) while metricbeat keeps running, and it only appears at all when metricbeat is part of the deployment. I see no warnings or errors in the metricbeat log while the leak is in progress; the only symptom is fewer events published to Elasticsearch (i.e., events from the bad-state container are missing).
At this point I cannot say for certain that this is a metricbeat issue. I have tried numerous combinations of versions; in fact, this was first spotted on version 6.7.1 of the Elastic stack. I have run out of ideas and would appreciate someone taking a look. For the time being we have disabled the docker module to avoid running out of FDs on the host.
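To put the leak rate in perspective (my own back-of-the-envelope arithmetic): with the docker module period at 10s, 4 FDs per cycle is 4 × 8,640 cycles/day = 34,560 FDs per day, so a single bad-state container would exhaust the 65,536 open-files limit shown further down in under two days.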
I've been able to simulate getting a generic container into this bad state and can reproduce the behavior consistently. To do so, I set up the following environment:
Elasticsearch setup:
docker run --name es -d -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.2-amd64
Test container setup:
docker run -d --name test --rm -it python:alpine3.13 python -c "while True: print(1);"
Metricbeat setup:
curl -L -O https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-oss-7.10.2-linux-x86_64.tar.gz
tar xzvf metricbeat-oss-7.10.2-linux-x86_64.tar.gz
Metricbeat configuration (placed in the extract path):
metricbeat.yml
---
name: metricbeat
metricbeat.config.modules:
  enabled: true
  path: ./module-*.yml
  reload.enabled: true
  reload.period: 30s
output.elasticsearch:
  enabled: true
  username: "elastic"
  password: "changeme"
  hosts: [ "http://localhost:9200" ]
logging.level: debug
logging.to_files: true
logging.files:
  path: ./
  name: metricbeat.log
  rotateeverybytes: 104857600
  interval: 24h
  keepfiles: 7
  permissions: 0644
logging.metrics:
  enabled: true
  period: 60s
setup.template.enabled: true
setup.template.settings:
  index.number_of_shards: 1
  index.auto_expand_replicas: "0-1"
module-docker.yml
---
- module: docker
  metricsets:
    #- container
    #- info
    - cpu
    - diskio
    - memory
    - network
    #- healthcheck
  cpu.cores: false
  enabled: true
  hosts: [ "unix:///var/run/docker.sock" ]
  period: 10s
Metricbeat start:
./metricbeat
After all of this is started up, I also run a script in an endless loop to check various file descriptor statistics, particularly the dockerd and docker.sock counts. Leaked sockets appear with a '* 0' peer in the output of the ss command on CentOS:
u_str ESTAB 0 0 /var/run/docker.sock 4832522 * 0 users:(("dockerd",pid=284967,fd=36))
check_fd.sh
#!/bin/bash
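# Poll dockerd and metricbeat file descriptor statistics in an endless loop,
# printing each count and its delta from the previous iteration.
# Run as root so /proc/<pid>/fd and lsof are readable.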
PRIOR_ABANDON_SOCKET_CNT=0
PRIOR_DOC_PROC_FD_CNT=0
PRIOR_DOC_LSOF_FD_CNT=0
PRIOR_DOC_LSOF_SOCKET_CNT=0
PRIOR_DOC_GO_ROUTINE_CNT=0
PRIOR_BEAT_PROC_FD_CNT=0
PRIOR_BEAT_LSOF_FD_CNT=0
while true; do
  echo ""
  echo "################################################################"
  echo ""
  echo "$(ss -a -p -e | grep docker.sock | grep -v LISTEN)"
  echo ""
  CURRENT_ABANDON_SOCKET_CNT=$(ss -a -p -e | grep docker.sock | grep -v LISTEN | grep '* 0' | wc -l)
  echo "Abandoned socket count=$CURRENT_ABANDON_SOCKET_CNT diff=$(expr $CURRENT_ABANDON_SOCKET_CNT - $PRIOR_ABANDON_SOCKET_CNT)"
  PRIOR_ABANDON_SOCKET_CNT=$CURRENT_ABANDON_SOCKET_CNT
  echo ""
  echo "################################################################"
  echo ""
  # dockerd FD counts, from /proc and from lsof
  CURRENT_DOC_PROC_FD_CNT=$(ls -la /proc/$(pidof dockerd)/fd | wc -l)
  echo "dockerd proc fd count=$CURRENT_DOC_PROC_FD_CNT diff=$(expr $CURRENT_DOC_PROC_FD_CNT - $PRIOR_DOC_PROC_FD_CNT)"
  PRIOR_DOC_PROC_FD_CNT=$CURRENT_DOC_PROC_FD_CNT
  CURRENT_DOC_LSOF_FD_CNT=$(lsof -p $(pidof dockerd) | wc -l)
  echo "dockerd lsof fd count=$CURRENT_DOC_LSOF_FD_CNT diff=$(expr $CURRENT_DOC_LSOF_FD_CNT - $PRIOR_DOC_LSOF_FD_CNT)"
  PRIOR_DOC_LSOF_FD_CNT=$CURRENT_DOC_LSOF_FD_CNT
  CURRENT_DOC_LSOF_SOCKET_CNT=$(lsof -p $(pidof dockerd) | grep '/var/run/docker.sock' | wc -l)
  echo "dockerd lsof docker.sock only count=$CURRENT_DOC_LSOF_SOCKET_CNT diff=$(expr $CURRENT_DOC_LSOF_SOCKET_CNT - $PRIOR_DOC_LSOF_SOCKET_CNT)"
  PRIOR_DOC_LSOF_SOCKET_CNT=$CURRENT_DOC_LSOF_SOCKET_CNT
  # dockerd goroutine count via its pprof debug endpoint
  CURRENT_DOC_GO_ROUTINE_CNT=$(curl -s --unix-socket /var/run/docker.sock http://./debug/pprof/goroutine\?debug\=2 | grep goroutine | wc -l)
  echo "dockerd go routine count=$CURRENT_DOC_GO_ROUTINE_CNT diff=$(expr $CURRENT_DOC_GO_ROUTINE_CNT - $PRIOR_DOC_GO_ROUTINE_CNT)"
  PRIOR_DOC_GO_ROUTINE_CNT=$CURRENT_DOC_GO_ROUTINE_CNT
  echo ""
  if pgrep -x "metricbeat" > /dev/null; then
    CURRENT_BEAT_PROC_FD_CNT=$(ls -la /proc/$(pidof metricbeat)/fd | wc -l)
    echo "metricbeat proc fd count=$CURRENT_BEAT_PROC_FD_CNT diff=$(expr $CURRENT_BEAT_PROC_FD_CNT - $PRIOR_BEAT_PROC_FD_CNT)"
    PRIOR_BEAT_PROC_FD_CNT=$CURRENT_BEAT_PROC_FD_CNT
    CURRENT_BEAT_LSOF_FD_CNT=$(lsof -p $(pidof metricbeat) | wc -l)
    echo "metricbeat lsof fd count=$CURRENT_BEAT_LSOF_FD_CNT diff=$(expr $CURRENT_BEAT_LSOF_FD_CNT - $PRIOR_BEAT_LSOF_FD_CNT)"
    PRIOR_BEAT_LSOF_FD_CNT=$CURRENT_BEAT_LSOF_FD_CNT
  else
    echo "metricbeat is not running"
  fi
  echo ""
  sleep 5
done
Set test container to bad state:
To get the container into a bad state, I issue 100 docker exec API calls; each one starts a long-running ping from a shell inside the container and then detaches, leaving the process running.
# for i in {1..100}; do ./ping_detach.sh ; done
ping_detach.sh
#!/bin/sh
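# Create an exec instance in the test container through the Docker API
# socket, then start it detached so the ping keeps running after we return.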
CONTAINER_NAME=test
RAW_ID=$(curl --silent --unix-socket /var/run/docker.sock "http://./containers/${CONTAINER_NAME}/exec" \
-H "Content-Type: application/json" \
-d '{
"AttachStdout": true,
"Tty": true,
"Cmd": [ "sh", "-c", "ping -i 0.1 -c 100000 -s 1024 <real IP address>"]
}'
)
ID=$(echo "$RAW_ID" | python -c 'import json,sys; obj=json.load(sys.stdin); print(obj["Id"])')
curl --silent --unix-socket /var/run/docker.sock "http://./exec/${ID}/start" \
-H "Content-Type: application/json" \
-d '{
"Detach": true,
"Tty": true
}'
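For reference, each iteration should be roughly equivalent to this single docker CLI command (my own untested shorthand for the two API calls above):
# docker exec -d -t test sh -c "ping -i 0.1 -c 100000 -s 1024 <real IP address>"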
After the exec loop has completed, I attempt to stop the test container with the docker command:
# docker stop test
At this point the container does not stop immediately and the command appears to hang. Meanwhile, the leaked FDs start to show up and keep accumulating until I either stop metricbeat or send a HUP signal to the container's moby shim process.
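The HUP can be sent to the shim roughly like this (my own shorthand, assuming Docker 18.09's containerd-shim process and that the shim's command line contains the container's full ID):
# kill -HUP $(pgrep -f "containerd-shim.*$(docker inspect -f '{{.Id}}' test)")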
Additional info:
A reasonable limit is set on file descriptors:
# ulimit -aHS
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1030582
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 65536
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 257666
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
In addition to running metricbeat from the setup above, I also spent some time running it under delve in debug mode and attaching from IntelliJ via a remote Go session:
go build -i -gcflags 'all=-N -l'
dlv --listen=:5005 --headless=true --api-version=2 --accept-multiclient exec ./metricbeat
An error is caught in file metricbeat/docker/docker.go at line 150 (function exportContainerStats) when the container ID matches the bad-state container. The err variable holds this information:
err = error | github.com/docker/docker/errdefs.errUnknown
error = github.com/docker/docker/client.errConnectionFailed
host = unix:///var/run/docker.sock
while the containerStats variable holds:
Body = nil
OSType = ""
The only other container, Elasticsearch, continues to return successful stats from the same function during this time.
I am not familiar enough with the metricbeat code or the docker library it calls to know whether the underlying socket connection is being closed properly, or whether a container in this state should not be probed for stats at all. I believe the lower-level call to the docker client's ContainerStats is actually timing out with no response after approximately 5 seconds.
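To illustrate what I mean, here is a minimal sketch (my own code, assuming the github.com/docker/docker/client Go API; this is not metricbeat's implementation) of a defensive pattern around ContainerStats that closes the response body on every path and guards against the nil Body seen in the debugger:
package main

import (
    "context"
    "encoding/json"
    "fmt"

    "github.com/docker/docker/api/types"
    "github.com/docker/docker/client"
)

// containerStats closes resp.Body on every path, including the error path,
// and guards against the nil Body observed on the bad-state container.
func containerStats(ctx context.Context, cli *client.Client, id string) (*types.StatsJSON, error) {
    resp, err := cli.ContainerStats(ctx, id, false)
    if resp.Body != nil {
        // Release the underlying unix-socket connection even when err != nil.
        defer resp.Body.Close()
    }
    if err != nil {
        return nil, fmt.Errorf("container stats for %s: %v", id, err)
    }
    var stats types.StatsJSON
    if err := json.NewDecoder(resp.Body).Decode(&stats); err != nil {
        return nil, err
    }
    return &stats, nil
}

func main() {
    cli, err := client.NewClientWithOpts(client.FromEnv)
    if err != nil {
        panic(err)
    }
    stats, err := containerStats(context.Background(), cli, "test")
    fmt.Println(stats, err)
}
If the error path in the real code returns before the body is closed, that would match the one-abandoned-socket-per-metricset-per-cycle pattern I am seeing, but I cannot confirm this from the outside.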
Any help is appreciated.
Thanks,
james