Docker module doesn't send healthcheck info

Hi,

I'm running metricbeat 6.2.2 using the official Docker image and have it configured to use the system and docker modules (see configuration below). I mapped the required volumes to /hostfs and my logstash is receiving messages from both modules. However, for the Docker module, the healthcheck metricset is missing. All other metric from the docker module are being sent without problems.

I checked the metricbeat logs with debug setting, and see that the [publish] entries contain the docker metricsets, except for the healthcheck. Looking at the logs (see below), the healthcheck module is started, it just doesn't seem to report anything. The logs do report an error, but I'm not sure if that's related:

|metricbeat_1  | 2018-03-01T08:46:09.837Z|ERROR|instance/metrics.go:69|Error while getting memory usage: error retrieving process stats|
|---|---|---|---|
|metricbeat_1  | 2018-03-01T08:46:09.840Z|ERROR|instance/metrics.go:113|Error retrieving CPU percentages: error retrieving process stats|

Any idea what I'm doing wrong here?

Thanks,
Thomas

My metricbeat.yml config:

metricbeat.modules:
- module: system
  metricsets: ["cpu", "fsstat", "load", "memory"]
  enabled: true
  period: 60s
  processes: ['.*']

- module: docker
  metricsets: ["container", "cpu", "diskio", "healthcheck", "memory"]
  hosts: ["unix:///hostfs/var/run/docker.sock"]
  period: 60s

output.elasticsearch:
  enabled: false

output.logstash:
  enabled: true
  hosts: ["xxxxxxxxxxxx:5044"]

The full log output:

metricbeat_1  | 2018-03-01T08:45:39.831Z	INFO	instance/beat.go:468	Home path: [/usr/share/metricbeat] Config path: [/usr/share/metricbeat] Data path: [/usr/share/metricbeat/data] Logs path: [/usr/share/metricbeat/logs]
metricbeat_1  | 2018-03-01T08:45:39.832Z	DEBUG	[beat]	instance/beat.go:495	Beat metadata path: /usr/share/metricbeat/data/meta.json
metricbeat_1  | 2018-03-01T08:45:39.832Z	INFO	instance/beat.go:475	Beat UUID: c64d7621-e962-487a-a1f4-1126b95351df
metricbeat_1  | 2018-03-01T08:45:39.832Z	INFO	instance/beat.go:213	Setup Beat: metricbeat; Version: 6.2.2
metricbeat_1  | 2018-03-01T08:45:39.832Z	DEBUG	[beat]	instance/beat.go:230	Initializing output plugins
metricbeat_1  | 2018-03-01T08:45:39.832Z	DEBUG	[processors]	processors/processor.go:49	Processors: 
metricbeat_1  | 2018-03-01T08:45:39.834Z	INFO	pipeline/module.go:76	Beat name: deployment-monitoring
metricbeat_1  | 2018-03-01T08:45:39.835Z	DEBUG	[processors]	processors/processor.go:49	Processors: 
metricbeat_1  | 2018-03-01T08:45:39.835Z	DEBUG	[processors]	processors/processor.go:49	Processors: 
metricbeat_1  | 2018-03-01T08:45:39.835Z	INFO	instance/beat.go:301	metricbeat start running.
metricbeat_1  | 2018-03-01T08:45:39.835Z	DEBUG	[module]	module/wrapper.go:100	Starting Wrapper[name=system, len(metricSetWrappers)=4]
metricbeat_1  | 2018-03-01T08:45:39.836Z	DEBUG	[module]	module/wrapper.go:100	Starting Wrapper[name=docker, len(metricSetWrappers)=5]
metricbeat_1  | 2018-03-01T08:45:39.836Z	INFO	[monitoring]	log/log.go:97	Starting metrics logging every 30s
metricbeat_1  | 2018-03-01T08:45:39.836Z	DEBUG	[module]	module/wrapper.go:146	system/cpu will start after 2.614424912s
metricbeat_1  | 2018-03-01T08:45:39.836Z	DEBUG	[module]	module/wrapper.go:146	system/fsstat will start after 6.380252131s
metricbeat_1  | 2018-03-01T08:45:39.836Z	DEBUG	[module]	module/wrapper.go:146	system/load will start after 3.676958315s
metricbeat_1  | 2018-03-01T08:45:39.836Z	DEBUG	[module]	module/wrapper.go:146	system/memory will start after 2.709925784s
metricbeat_1  | 2018-03-01T08:45:39.836Z	DEBUG	[module]	module/wrapper.go:146	docker/container will start after 1.864830474s
metricbeat_1  | 2018-03-01T08:45:39.836Z	DEBUG	[module]	module/wrapper.go:146	docker/cpu will start after 6.846147598s
metricbeat_1  | 2018-03-01T08:45:39.836Z	DEBUG	[module]	module/wrapper.go:146	docker/diskio will start after 4.704152646s
metricbeat_1  | 2018-03-01T08:45:39.836Z	DEBUG	[module]	module/wrapper.go:146	docker/healthcheck will start after 7.408250999s
metricbeat_1  | 2018-03-01T08:45:39.836Z	DEBUG	[module]	module/wrapper.go:146	docker/memory will start after 1.738964743s
metricbeat_1  | 2018-03-01T08:45:41.576Z	DEBUG	[module]	module/wrapper.go:154	Starting metricSetWrapper[module=docker, name=memory, host=/hostfs/var/run/docker.sock]
metricbeat_1  | 2018-03-01T08:45:41.701Z	DEBUG	[module]	module/wrapper.go:154	Starting metricSetWrapper[module=docker, name=container, host=/hostfs/var/run/docker.sock]
metricbeat_1  | 2018-03-01T08:45:42.451Z	DEBUG	[module]	module/wrapper.go:154	Starting metricSetWrapper[module=system, name=cpu, host=]
metricbeat_1  | 2018-03-01T08:45:42.546Z	DEBUG	[module]	module/wrapper.go:154	Starting metricSetWrapper[module=system, name=memory, host=]
metricbeat_1  | 2018-03-01T08:45:43.513Z	DEBUG	[module]	module/wrapper.go:154	Starting metricSetWrapper[module=system, name=load, host=]
metricbeat_1  | 2018-03-01T08:45:44.541Z	DEBUG	[module]	module/wrapper.go:154	Starting metricSetWrapper[module=docker, name=diskio, host=/hostfs/var/run/docker.sock]
metricbeat_1  | 2018-03-01T08:45:46.218Z	DEBUG	[module]	module/wrapper.go:154	Starting metricSetWrapper[module=system, name=fsstat, host=]
metricbeat_1  | 2018-03-01T08:45:46.683Z	DEBUG	[module]	module/wrapper.go:154	Starting metricSetWrapper[module=docker, name=cpu, host=/hostfs/var/run/docker.sock]
metricbeat_1  | 2018-03-01T08:45:47.245Z	DEBUG	[module]	module/wrapper.go:154	Starting metricSetWrapper[module=docker, name=healthcheck, host=/hostfs/var/run/docker.sock]
metricbeat_1  | 2018-03-01T08:46:09.837Z	ERROR	instance/metrics.go:69	Error while getting memory usage: error retrieving process stats
metricbeat_1  | 2018-03-01T08:46:09.840Z	ERROR	instance/metrics.go:113	Error retrieving CPU percentages: error retrieving process stats
metricbeat_1  | 2018-03-01T08:46:09.841Z	INFO	[monitoring]	log/log.go:124	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"info":{"ephemeral_id":"45993308-fe8b-4085-ab00-95d8c41f4233","uptime":{"ms":30030}},"memstats":{"gc_next":4488208,"memory_alloc":4107584,"memory_total":11721832}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":24,"batches":4,"total":24},"read":{"bytes":24},"type":"logstash","write":{"bytes":6550}},"pipeline":{"clients":2,"events":{"active":0,"published":24,"retry":7,"total":24},"queue":{"acked":24}}},"metricbeat":{"docker":{"container":{"events":5,"success":5},"cpu":{"events":5,"success":5},"diskio":{"events":5,"success":5},"memory":{"events":5,"success":5}},"system":{"cpu":{"events":1,"success":1},"fsstat":{"events":1,"success":1},"load":{"events":1,"success":1},"memory":{"events":1,"success":1}}},"system":{"cpu":{"cores":1},"load":{"1":0,"15":0.03,"5":0.04,"norm":{"1":0,"15":0.03,"5":0.04}}}}}}
1 Like

Any chance you could run metricbeat with only the healtcheck metricset enabled and enable debug logging and share it here?

Which version of Docker are you using?

The above errors should not be related as they are coming from the metrics collection.

Thanks for your reply!

The docker version is: Docker version 17.12.0-ce, build c97c6d6

The log when only running healthcheck is pasted below. It seems to collect metrics (the non zero metrics message), but I don't see any "publish" log messages (even though -d is set to "*").

metricbeat_1  | 2018-03-04T21:14:10.518Z        INFO    instance/beat.go:468      Home path: [/usr/share/metricbeat] Config path: [/usr/share/metricbeat] Data path: [/usr/share/metricbeat/data] Logs path: [/usr/share/metricbeat/logs]
metricbeat_1  | 2018-03-04T21:14:10.567Z        DEBUG   [beat]  instance/beat.go:495       Beat metadata path: /usr/share/metricbeat/data/meta.json
metricbeat_1  | 2018-03-04T21:14:10.568Z        INFO    instance/beat.go:475      Beat UUID: 15179563-d149-4e8d-b093-68f52ca181ec
metricbeat_1  | 2018-03-04T21:14:10.568Z        INFO    instance/beat.go:213      Setup Beat: metricbeat; Version: 6.2.2
metricbeat_1  | 2018-03-04T21:14:10.569Z        DEBUG   [beat]  instance/beat.go:230       Initializing output plugins
metricbeat_1  | 2018-03-04T21:14:10.569Z        DEBUG   [processors]    processors/processor.go:49 Processors:
metricbeat_1  | 2018-03-04T21:14:10.571Z        INFO    pipeline/module.go:76     Beat name: deployment-monitoring
metricbeat_1  | 2018-03-04T21:14:10.572Z        DEBUG   [modules]       beater/metricbeat.go:80    Register [ModuleFactory:[docker, mongodb, mysql, postgresql, system, uwsgi], MetricSetFactory:[aerospike/namespace, apache/status, ceph/cluster_disk, ceph/cluster_health, ceph/cluster_status, ceph/monitor_health, ceph/osd_df, ceph/osd_tree, ceph/pool_disk, couchbase/bucket, couchbase/cluster, couchbase/node, docker/container, docker/cpu, docker/diskio, docker/healthcheck, docker/image, docker/info, docker/memory, docker/network, dropwizard/collector, elasticsearch/node, elasticsearch/node_stats, etcd/leader, etcd/self, etcd/store, golang/expvar, golang/heap, graphite/server, haproxy/info, haproxy/stat, http/json, http/server, jolokia/jmx, kafka/consumergroup, kafka/partition, kibana/status, kubernetes/container, kubernetes/event, kubernetes/node, kubernetes/pod, kubernetes/state_container, kubernetes/state_deployment, kubernetes/state_node, kubernetes/state_pod, kubernetes/state_replicaset, kubernetes/system, kubernetes/volume, logstash/node, logstash/node_stats, memcached/stats, mongodb/collstats, mongodb/dbstats, mongodb/status, mysql/status, nginx/stubstatus, php_fpm/pool, postgresql/activity, postgresql/bgwriter, postgresql/database, prometheus/collector, prometheus/stats, rabbitmq/node, rabbitmq/queue, redis/info, redis/keyspace, system/core, system/cpu, system/diskio, system/filesystem, system/fsstat, system/load, system/memory, system/network, system/process, system/process_summary, system/raid, system/socket, system/uptime, uwsgi/status, vsphere/datastore, vsphere/host, vsphere/virtualmachine, zookeeper/mntr]]
metricbeat_1  | 2018-03-04T21:14:10.572Z        DEBUG   [processors]    processors/processor.go:49 Processors:
metricbeat_1  | 2018-03-04T21:14:10.573Z        INFO    instance/beat.go:301      metricbeat start running.
metricbeat_1  | 2018-03-04T21:14:10.573Z        DEBUG   [module]        module/wrapper.go:100      Starting Wrapper[name=docker, len(metricSetWrappers)=1]
metricbeat_1  | 2018-03-04T21:14:10.573Z        INFO    [monitoring]    log/log.go:97      Starting metrics logging every 30s
metricbeat_1  | 2018-03-04T21:14:10.573Z        DEBUG   [module]        module/wrapper.go:146      docker/healthcheck will start after 4.854348903s
metricbeat_1  | 2018-03-04T21:14:15.428Z        DEBUG   [module]        module/wrapper.go:154      Starting metricSetWrapper[module=docker, name=healthcheck, host=/hostfs/var/run/docker.sock]
metricbeat_1  | 2018-03-04T21:14:40.575Z        INFO    [monitoring]    log/log.go:124     Non-zero metrics in the last 30s        {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":20,"time":24},"total":{"ticks":50,"time":60,"value":50},"user":{"ticks":30,"time":36}},"info":{"ephemeral_id":"f01b2c91-2234-4417-bae3-8e7f9b676e44","uptime":{"ms":30074}},"memstats":{"gc_next":4194304,"memory_alloc":2015408,"memory_total":3889640,"rss":30453760}},"libbeat":{"config":{"module":{"running":0}},"output":{"type":"logstash"},"pipeline":{"clients":1,"events":{"active":0}}},"system":{"cpu":{"cores":1},"load":{"1":0.31,"15":0.07,"5":0.14,"norm":{"1":0.31,"15":0.07,"5":0.14}}}}}}
metricbeat_1  | 2018-03-04T21:15:10.575Z        INFO    [monitoring]    log/log.go:124     Non-zero metrics in the last 30s        {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":20,"time":24},"total":{"ticks":50,"time":60,"value":50},"user":{"ticks":30,"time":36}},"info":{"ephemeral_id":"f01b2c91-2234-4417-bae3-8e7f9b676e44","uptime":{"ms":60073}},"memstats":{"gc_next":4194304,"memory_alloc":2276288,"memory_total":4150520}},"libbeat":{"config":{"module":{"running":0}},"pipeline":{"clients":1,"events":{"active":0}}},"system":{"load":{"1":0.19,"15":0.07,"5":0.12,"norm":{"1":0.19,"15":0.07,"5":0.12}}}}}}

Could you try to only have the healtcheck metricset enabled in your config and let it run a bit longer and share the log output?

Nothing special with the docker version I think. Can you also run docker ps and paste it here? Do your containers expose the healthcheck?

@exekias Any ideas? I think I miss something simple here.

Healthcheck is retrieved from status string, having the output of docker ps will be useful here, as pointed out by @ruflin.

Thanks for the pointers. Turns out I was the one who missed something simple...I did not realize you need to enable healthcheck on the docker container side . I thought it was a metric that was automatically generated (similar to the other metrics, such as "container", "cpu", "memory", etc.). I exposed the healthcheck in a docker container and now metricbeat does indeed publish healthcheck entries.

Thanks for your help!

1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.