Metricbeat version: 7.7.1
OS version: Flatcar 2303.4.0 and RHEL 7.4
Docker: Docker version 18.06.3-ce, build d7080c1
Docker Swarm is also configured on these nodes.
Config:
#-------------------------------- System Module --------------------------------
- module: system
  enabled: true
  period: 10s
  metricsets:
    - cpu
    - load
    - memory
    - network
    - process
    - process_summary
    - socket_summary
    - core
    - diskio
    - socket
  process.include_top_n:
    by_cpu: 5      # include top 5 processes by CPU
    by_memory: 5   # include top 5 processes by memory
- module: system
  period: 1m
  metricsets:
    - filesystem
    - fsstat
    - service
    - users
  processors:
    - drop_event.when.regexp:
        system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib|snap)($|/)'
- module: system
  period: 15m
  metricsets:
    - uptime
#-------------------------------- Docker Module --------------------------------
- module: docker
  enabled: true
  period: 10s
  metricsets:
    - container
    - cpu
    - diskio
    - event
    - healthcheck
    - info
    - memory
    - network
  hosts: ["unix:///var/run/docker.sock"]
After a certain period, the logs fill up with:
Jul 08 13:05:53 sldclr0102 metricbeat[129177]: 2020-07-08T13:05:53.169+0200 DEBUG [logstash] logstash/async.go:119 connect
Jul 08 13:05:53 sldclr0102 metricbeat[129177]: 2020-07-08T13:05:53.177+0200 INFO [publisher_pipeline_output] pipeline/output.go:111 Connection to backoff(async(tcp://cpxlog-elk102.support.local:5045)) established
Jul 08 13:05:53 sldclr0102 metricbeat[129177]: 2020-07-08T13:05:53.177+0200 DEBUG [logstash] logstash/enc.go:37 Failed to encode event: &{2020-07-08 12:32:02.974424766 +0200 CEST m=+1187.408593395 {} Not valid json: json: error calling MarshalJSON for type common.Float: invalid character 'N' looking for beginning of value <nil> true}
Jul 08 13:05:53 sldclr0102 metricbeat[129177]: 2020-07-08T13:05:53.177+0200 DEBUG [logstash] logstash/async.go:171 35 events out of 35 events sent to logstash host cpxlog-elk102.support.local:5045. Continue sending
Jul 08 13:05:53 sldclr0102 metricbeat[129177]: 2020-07-08T13:05:53.177+0200 DEBUG [logstash] logstash/async.go:127 close connection
Jul 08 13:05:53 sldclr0102 metricbeat[129177]: 2020-07-08T13:05:53.177+0200 DEBUG [transport] transport/client.go:118 closing
Jul 08 13:05:53 sldclr0102 metricbeat[129177]: 2020-07-08T13:05:53.178+0200 ERROR [logstash] logstash/async.go:279 Failed to publish events caused by: unsupported float value: NaN
Jul 08 13:05:53 sldclr0102 metricbeat[129177]: 2020-07-08T13:05:53.178+0200 DEBUG [logstash] logstash/async.go:127 close connection
It looks like this starts when containers are stopped or started, but we can't easily determine which metricset is causing it.
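
One way to narrow this down might be to pull the original event timestamp out of each "Failed to encode event" line and compare it with container stop/start times (for example from `docker events`). In the excerpt above, the failing event is stamped 12:32:02 while the log line is from 13:05:53, about 34 minutes later. Below is a minimal Python sketch of that extraction. The regular expression is an assumption based only on the single log line shown here, and the names `failed_event_times` and `summarize` are hypothetical helpers, not part of Metricbeat.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed pattern, derived from the one sample line in this report:
# "Failed to encode event: &{2020-07-08 12:32:02.974424766 +0200 CEST m=+..."
FAILED_EVENT = re.compile(
    r"Failed to encode event: &\{"
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ ([+-]\d{4})"
)


def failed_event_times(lines):
    """Yield the original timestamp of each event that failed JSON encoding."""
    for line in lines:
        match = FAILED_EVENT.search(line)
        if match:
            stamp, offset = match.groups()
            # Sub-second digits are dropped; second precision is enough
            # to line the event up with container stop/start times.
            yield datetime.strptime(f"{stamp} {offset}", "%Y-%m-%d %H:%M:%S %z")


def summarize(lines):
    """Count failed-encode log lines per original event timestamp."""
    return Counter(failed_event_times(lines))
```

Calling `summarize()` on the lines of a saved log (any iterable of strings works, including an open file) returns a counter keyed by event timestamp. A timestamp that is logged many times would suggest the same event is being retried, and its time can then be checked against when a container was stopped or started.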