Trouble running metricbeat:5.3.0 with docker module

Hi,

I've got metricbeat:5.3.0 happily pumping host data (system module) directly to elasticsearch:5.3.0 within a docker swarm (swarm mode), but I'm having difficulty getting the metricbeat docker module to work (I have dockbeat:latest happily pushing data too).

When I run the config below I see records in kibana:5.3.0 from the metricbeat docker module like this:

{
  "_index": "metricbeat-2017.04.12",
  "_type": "metricsets",
  "_id": "AVtjJnaoXM4f8Yh2ACrG",
  "_score": null,
  "_source": {
    "@timestamp": "2017-04-12T17:11:56.783Z",
    "beat": {
      "hostname": "4d21d0c403c4",
      "name": "4d21d0c403c4",
      "version": "5.3.0"
    },
    "docker": {
      "container": {}
    },
    "error": "Get http://unix.sock/containers/json?: dial unix /var/run/docker.sock: connect: permission denied",
    "metricset": {
      "host": "/var/run/docker.sock",
      "module": "docker",
      "name": "container",
      "rtt": 849
    },
    "type": "metricsets"
  },
  "fields": {
    "@timestamp": [
      1492017116783
    ]
  },
  "highlight": {
    "metricset.module": [
      "@kibana-highlighted-field@docker@/kibana-highlighted-field@"
    ]
  },
  "sort": [
    1492017116783
  ]
} 

...and in the metricbeat log:

2017/04/12 17:14:46.673063 metrics.go:39: INFO Non-zero metrics in the last 30s: fetches.docker-container.events=3 fetches.docker-container.failures=3 fetches.docker-cpu.events=3 fetches.docker-cpu.failures=3 fetches.docker-diskio.events=3 fetches.docker-diskio.failures=3 fetches.docker-healthcheck.events=3 fetches.docker-healthcheck.failures=3 fetches.docker-info.events=3 fetches.docker-info.failures=3 fetches.docker-memory.events=3 fetches.docker-memory.failures=3 fetches.docker-network.events=3 fetches.docker-network.failures=3 fetches.system-cpu.events=3 fetches.system-cpu.success=3 fetches.system-filesystem.events=249 fetches.system-filesystem.success=3 fetches.system-memory.events=3 fetches.system-memory.success=3 fetches.system-network.events=9 fetches.system-network.success=3 fetches.system-process.events=615 fetches.system-process.success=3 libbeat.es.call_count.PublishEvents=22 libbeat.es.publish.read_bytes=10525 libbeat.es.publish.write_bytes=840459 libbeat.es.published_and_acked_events=900 libbeat.publisher.messages_in_worker_queues=900 libbeat.publisher.published_events=900
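A long metrics line like this is easier to read once the failure counters are pulled out. Here is a minimal sketch using grep; the function name failing_metricsets is made up for illustration, and the pattern assumes the counter names look like the ones in the line above.

```shell
# Print every "fetches.<module>-<metricset>.failures=<n>" counter found on
# stdin, one per line. Success and event counters are ignored.
failing_metricsets() {
  grep -o 'fetches\.[a-z-]*\.failures=[0-9]*'
}

# Usage with a shortened copy of the log line above:
line='INFO Non-zero metrics in the last 30s: fetches.docker-container.events=3 fetches.docker-container.failures=3 fetches.system-cpu.events=3 fetches.system-cpu.success=3'
printf '%s\n' "$line" | failing_metricsets
```

On the shortened line this prints only fetches.docker-container.failures=3, which matches the picture in the full log: every docker metricset fails while the system metricsets succeed.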

Here's the snippet from my docker-compose.yml:

  metricbeat:
    image: docker.elastic.co/beats/metricbeat:5.3.0
    volumes:
      - /proc:/hostfs/proc:ro
      - /sys/fs/cgroup:/hostfs/sys/fs/cgroup:ro
      - /:/hostfs:ro
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - logging
    deploy:
      mode: global
    secrets:
      - metricbeat_yml
    command: metricbeat -e -c /run/secrets/metricbeat_yml -system.hostfs=/hostfs

secrets:
  metricbeat_yml:
    file: metricbeat.yml

and my metricbeat.yml:

metricbeat.modules:
- module: system
  metricsets:
    - cpu
    - filesystem
    - memory
    - network
    - process
  enabled: true
  period: 10s
  processes: ['.*']
  cpu_ticks: false
- module: docker
  metricsets:
    - container
    - cpu 
    - diskio
    - healthcheck
    - info
    - memory
    - network
  hosts: ["unix:///var/run/docker.sock"]
  enabled: true
  period: 10s

output.elasticsearch:
  hosts: ["elasticsearch:9200"]

I know this module is in beta, but the example config on the documentation page is commented out, which seems kind of strange to me: https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-module-docker.html

I think I must be doing something stupid...

It's a permission error. What are the file permissions on /var/run/docker.sock and who is the user that is trying to access the socket? One of those two things needs to change. For example, adding user: root to the docker-compose.yml for the metricbeat container will change the user that metricbeat runs as.
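To answer those two questions from inside the container, something like the following should do. It is a sketch that assumes GNU stat (for the -c option) is available in the image; the helper name socket_perms is made up for illustration.

```shell
# Print the numeric owner, numeric group and octal mode of a socket (or any
# file). Uses GNU stat's -c format option, so this is Linux-specific.
socket_perms() {
  stat -c 'uid=%u gid=%g mode=%a' "$1"
}

# Inside the Metricbeat container you would compare the two outputs:
#   id                                  # user and groups Metricbeat runs as
#   socket_perms /var/run/docker.sock   # who is allowed to use the socket
```

If the UID reported by id is not the socket's owner and none of its groups match the socket's GID, only the "other" permission bits apply, and those are typically empty on the Docker socket.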

Probably the docs should make a recommendation on this.

Thanks @andrewkroh, that worked :)

From metricbeat's behaviour when it starts up (changing ownership of the config file), I imagine that this is a result of that initial hoop jumping. It doesn't feel right to bypass that, and I'd be uncomfortable suggesting it in the module's documentation.

Another option would be to customize the container by adding the beats user that metricbeat runs as to the docker group. But I'm guessing this isn't portable since the docker GID could differ across hosts.

@jarpy Any ideas or recommendations w.r.t. docker socket permissions?

I got the same issue with the official image.

How exactly should I modify docker-compose.yml?
Here is my current one:

metricbeat:
    image: docker.elastic.co/beats/metricbeat:5.2.1
    environment:
      # prevent logspout/logstash sending this container's logs
      LOGSPOUT: ignore
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /data/metricbeat.yml:/usr/share/metricbeat/metricbeat.yml:ro

$ docker-compose exec metricbeat  ls -lah  /var/run/docker.sock
srw-rw---- 1 root 999 0 Feb  6 01:15 /var/run/docker.sock

Because this is happening with the official image, is there an issue tracking this on GitHub? (Many people search on GitHub, too.)

I'm having exactly the same problem.
Running the docker container as root isn't an option.
I'm thinking about the following approach:

  1. provide a separate entrypoint.sh
  2. read out the group (the GID) of /var/run/docker.sock
  3. add a new group with that GID
  4. add user metricbeat to this group

I haven't implemented this yet. Just thinking about it.
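For illustration, the four steps above could be sketched roughly like this. Treat it as an untested outline, not something from the official image: the variable names DOCKER_SOCK and BEAT_USER, the fallback group name dockersock, and the availability of getent, groupadd, usermod and runuser in the image are all assumptions. Note that the script itself has to start as root in order to change group membership, and only then switches to the unprivileged user.

```shell
#!/bin/sh
# Sketch of a hypothetical entrypoint.sh. It must be started as root and
# switches to the Beat user just before running the real command.

# Echo the name of the existing group with GID $1, or a made-up fallback name.
resolve_group_name() {
  name="$(getent group "$1" | cut -d: -f1)"
  echo "${name:-dockersock}"
}

main() {
  set -eu
  sock="${DOCKER_SOCK:-/var/run/docker.sock}"  # DOCKER_SOCK: assumed variable
  beat_user="${BEAT_USER:-metricbeat}"         # BEAT_USER: assumed variable

  # Step 2: read the numeric GID that owns the socket (GNU stat).
  gid="$(stat -c '%g' "$sock")"
  group="$(resolve_group_name "$gid")"

  # Step 3: create the group only if no group has this GID yet.
  getent group "$gid" >/dev/null || groupadd --gid "$gid" "$group"

  # Step 4: add the Beat user to that group.
  usermod --append --groups "$group" "$beat_user"

  # Run the real command as the unprivileged user.
  exec runuser -u "$beat_user" -- "$@"
}

# In the real entrypoint.sh the last line would be: main "$@"
```

Because the GID is read from the mounted socket at container start, the same image should work on hosts where the docker group has different GIDs.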

This is probably the best option without running Metricbeat as root, but you're absolutely right about the GID. On my system, the docker group is GID 999, so I can create a custom image with:

FROM docker.elastic.co/beats/metricbeat:5.3.1
USER root
RUN addgroup docker --gid 999 && \
    usermod --append --group docker metricbeat
USER metricbeat

That magic number is a problem, though.

I also tried making the Docker socket world-readable, but that wasn't sufficient. I think that the library used by Metricbeat to do HTTP over Unix sockets is trying to open the socket read/write, even though you would expect read-only to be sufficient for Metricbeat's purposes.
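One possible explanation that does not depend on the library: on Linux, unix(7) documents that connecting to a stream socket requires write permission on the socket file, so read-only access would not be enough for any client. A small check along those lines, with a made-up function name, might look like this:

```shell
# Report whether the current user has write permission on the path in $1.
# On Linux, connect() to a Unix stream socket needs write permission on the
# socket file, so "readable but not writable" still ends in permission denied.
can_connect() {
  if [ -w "$1" ]; then
    echo "writable"
  else
    echo "not writable"
  fi
}

# Inside the Metricbeat container:
#   can_connect /var/run/docker.sock
```

The -w test goes through the normal access check, so it should also reflect group membership and ACL entries. Run as root it reports any existing file as writable, so run it as the Metricbeat user.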

Here's an arguably more elegant solution. Grant explicit access to the Metricbeat user with a filesystem ACL.

setfacl -m u:1000:rw /var/run/docker.sock

This assumes that your system supports filesystem ACLs, but if it's modern enough to run Docker, then it probably does. Also note that on many systems, permissions for the Docker socket are managed by systemd, and won't persist after a reboot etc. If you end up using this technique, you'd probably want to modify the systemd unit for Docker to set the ACL.
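As a rough sketch of that last idea, a systemd drop-in could re-apply the ACL every time Docker starts. The assumptions here are that the unit is called docker.service, that setfacl is installed at /usr/bin/setfacl, and that UID 1000 is the Metricbeat user as in the command above; the function name and the drop-in file name are made up.

```shell
# Print a systemd drop-in that re-applies the socket ACL after Docker starts.
# Assumed: unit name docker.service, setfacl in /usr/bin, Metricbeat UID 1000.
docker_acl_dropin() {
  cat <<'EOF'
[Service]
ExecStartPost=/usr/bin/setfacl -m u:1000:rw /var/run/docker.sock
EOF
}

# To install it (as root), something like:
#   mkdir -p /etc/systemd/system/docker.service.d
#   docker_acl_dropin > /etc/systemd/system/docker.service.d/10-socket-acl.conf
#   systemctl daemon-reload && systemctl restart docker
#   getfacl /var/run/docker.sock   # check that the user:1000:rw- entry is there
```

A drop-in keeps the change separate from the packaged unit file, so it should survive Docker package upgrades.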

