Metricbeat Kubernetes module - all events have failures but no ERROR message in logs

I am using stable/metricbeat Helm Chart for installing metricbeat on Kubernetes cluster

NAME                    CHART VERSION   APP VERSION     DESCRIPTION
stable/metricbeat       1.7.0           6.7.0           A Helm chart to collect Kubernetes logs with metricbeat

Some of the values are overridden by values file:

daemonset:
  overrideConfig:
    ...
    kubernetes:
      config:
        - module: kubernetes
          metricsets:
            - node
            - system
            - pod
            - container
            - volume
          period: 10s
          host: ${NODE_NAME}
          hosts: ["https://${HOSTNAME}:10250"]
          bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
          ssl.certificate_authorities:
          - /var/run/secrets/kubernetes.io/serviceaccount/ca.crt

This setup works correctly to get system metrics to all the way to Kibana. None of the Kubernetes related fields are forwarded.

Looking at metricbeat logs I can see there are failed events, but there are no "ERROR" messages in logs.

In particular

"pod":{"events":2,"failures":2}

For reference if I go directly to kubelet API (https://${HOSTNAME}:10250/metrics or metrics/cadvisor) all metrics are retrieved properly.

How can I debug this or get more accurate error information? What can be the root cause?

Full log for reference (it was too large to fit in original post body)

2019-08-07T12:06:22.114Z        INFO    instance/beat.go:606    Home path: [/usr/share/metricbeat] Config path: [/usr/share/metricbeat] Data path: [/usr/share/metricbeat/data] Logs path: [/usr/share/metricbeat/logs]
2019-08-07T12:06:22.115Z        INFO    instance/beat.go:614    Beat ID: b869998f-ac0b-4b92-b641-99b1b8214d25
2019-08-07T12:06:22.115Z        INFO    [seccomp]       seccomp/seccomp.go:116  Syscall filter successfully installed
2019-08-07T12:06:22.116Z        INFO    [beat]  instance/beat.go:902    Beat info       {"system_info": {"beat": {"path": {"config": "/usr/share/metricbeat", "data": "/usr/share/metricbeat/data", "home": "/usr/share/metricbeat", "logs": "/usr/share/metricbeat/logs"}, "type": "metricbeat", "uuid": "b869998f-ac0b-4b92-b641-99b1b8214d25"}}}
2019-08-07T12:06:22.116Z        INFO    [beat]  instance/beat.go:911    Build info      {"system_info": {"build": {"commit": "9ba65d864ca37cd32c25b980dbb4020975288fc0", "libbeat": "7.2.0", "time": "2019-06-20T15:07:31.000Z", "version": "7.2.0"}}}
2019-08-07T12:06:22.116Z        INFO    [beat]  instance/beat.go:914    Go runtime info {"system_info": {"go": {"os":"linux","arch":"amd64","max_procs":4,"version":"go1.12.4"}}}
2019-08-07T12:06:22.119Z        INFO    [beat]  instance/beat.go:918    Host info       {"system_info": {"host": {"architecture":"x86_64","boot_time":"2019-07-25T22:21:27Z","containerized":false,"name":"GDNCSLFRBM0035","ip":["127.0.0.1/8","::1/128","10.22.65.55/22","fe80::250:56ff:fe85:c2ac/64","172.17.0.1/16","fe80::42:9fff:feb7:2ff5/64","fe80::ecee:eeff:feee:eeee/64","10.42.1.0/32","fe80::c456:c6ff:fe80:a2fd/64","fe80::ecee:eeff:feee:eeee/64","fe80::ecee:eeff:feee:eeee/64","fe80::ecee:eeff:feee:eeee/64","fe80::ecee:eeff:feee:eeee/64"],"kernel_version":"4.15.0-38-generic","mac":["00:50:56:85:c2:ac","02:42:9f:b7:2f:f5","ee:ee:ee:ee:ee:ee","c6:56:c6:80:a2:fd","ee:ee:ee:ee:ee:ee","ee:ee:ee:ee:ee:ee","ee:ee:ee:ee:ee:ee","ee:ee:ee:ee:ee:ee"],"os":{"family":"redhat","platform":"centos","name":"CentOS Linux","version":"7 (Core)","major":7,"minor":6,"patch":1810,"codename":"Core"},"timezone":"UTC","timezone_offset_sec":0}}}
2019-08-07T12:06:22.119Z        INFO    [beat]  instance/beat.go:947    Process info    {"system_info": {"process": {"capabilities": {"inheritable":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"permitted":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"effective":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"bounding":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"ambient":null}, "cwd": "/usr/share/metricbeat", "exe": "/usr/share/metricbeat/metricbeat", "name": "metricbeat", "pid": 1, "ppid": 0, "seccomp": {"mode":"filter","no_new_privs":true}, "start_time": "2019-08-07T12:06:21.540Z"}}}
2019-08-07T12:06:22.119Z        INFO    instance/beat.go:292    Setup Beat: metricbeat; Version: 7.2.0
2019-08-07T12:06:22.120Z        INFO    elasticsearch/client.go:166     Elasticsearch url: https://elasticsearch-es-http:9200
2019-08-07T12:06:22.120Z        INFO    [publisher]     pipeline/module.go:97   Beat name: GDNCSLFRBM0035
2019-08-07T12:06:22.121Z        INFO    [monitoring]    log/log.go:118  Starting metrics logging every 30s
2019-08-07T12:06:22.121Z        INFO    kibana/client.go:118    Kibana url: https://kibana-kb-http:5601
2019-08-07T12:06:22.921Z        INFO    kibana/client.go:118    Kibana url: https://kibana-kb-http:5601
2019-08-07T12:06:23.147Z        INFO    add_cloud_metadata/add_cloud_metadata.go:347    add_cloud_metadata: hosting provider type not detected.
2019-08-07T12:06:52.123Z        INFO    [monitoring]    log/log.go:145  Non-zero metrics in the last 30s        {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":60,"time":{"ms":64}},"total":{"ticks":640,"time":{"ms":646},"value":640},"user":{"ticks":580,"time":{"ms":582}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":6},"info":{"ephemeral_id":"f80769e3-5ba3-4be6-aa6b-3dea3be621a2","uptime":{"ms":30078}},"memstats":{"gc_next":11024752,"memory_alloc":8908408,"memory_total":78420488,"rss":63479808},"runtime":{"goroutines":16}},"libbeat":{"config":{"module":{"running":0}},"output":{"type":"elasticsearch"},"pipeline":{"clients":0,"events":{"active":0}}},"system":{"cpu":{"cores":4},"load":{"1":0.28,"15":0.6,"5":0.48,"norm":{"1":0.07,"15":0.15,"5":0.12}}}}}}

Full log for reference (it was too large to fit in original post body)

2019-08-07T12:07:06.797Z        INFO    instance/beat.go:776    Kibana dashboards successfully loaded.
2019-08-07T12:07:06.797Z        INFO    instance/beat.go:421    metricbeat start running.
2019-08-07T12:07:06.798Z        INFO    kubernetes/util.go:71   kubernetes: Using node gdncslfrbm0035 provided in the config
2019-08-07T12:07:06.799Z        INFO    kubernetes/util.go:71   kubernetes: Using node gdncslfrbm0035 provided in the config
2019-08-07T12:07:06.806Z        INFO    filesystem/filesystem.go:57     Ignoring filesystem types: sysfs, rootfs, ramfs, bdev, proc, cpuset, cgroup, cgroup2, tmpfs, devtmpfs, configfs, debugfs, tracefs, securityfs, sockfs, dax, bpf, pipefs, hugetlbfs, devpts, ecryptfs, fuse, fusectl, pstore, mqueue, autofs, overlay, aufs
2019-08-07T12:07:06.806Z        INFO    [system.fsstat] fsstat/fsstat.go:56     Ignoring filesystem types: %ssysfs, rootfs, ramfs, bdev, proc, cpuset, cgroup, cgroup2, tmpfs, devtmpfs, configfs, debugfs, tracefs, securityfs, sockfs, dax, bpf, pipefs, hugetlbfs, devpts, ecryptfs, fuse, fusectl, pstore, mqueue, autofs, overlay, aufs
2019-08-07T12:07:06.806Z        INFO    cfgfile/reload.go:172   Config reloader started
2019-08-07T12:07:06.809Z        INFO    kubernetes/util.go:71   kubernetes: Using node gdncslfrbm0035 provided in the config
2019-08-07T12:07:06.809Z        INFO    kubernetes/util.go:71   kubernetes: Using node gdncslfrbm0035 provided in the config
2019-08-07T12:07:06.810Z        INFO    kubernetes/watcher.go:182       kubernetes: Performing a resource sync for *v1.NodeList
2019-08-07T12:07:06.810Z        INFO    kubernetes/watcher.go:182       kubernetes: Performing a resource sync for *v1.PodList
2019-08-07T12:07:06.811Z        INFO    kubernetes/watcher.go:182       kubernetes: Performing a resource sync for *v1.PodList
2019-08-07T12:07:06.811Z        INFO    filesystem/filesystem.go:57     Ignoring filesystem types: sysfs, rootfs, ramfs, bdev, proc, cpuset, cgroup, cgroup2, tmpfs, devtmpfs, configfs, debugfs, tracefs, securityfs, sockfs, dax, bpf, pipefs, hugetlbfs, devpts, ecryptfs, fuse, fusectl, pstore, mqueue, autofs, overlay, aufs
2019-08-07T12:07:06.812Z        INFO    [system.fsstat] fsstat/fsstat.go:56     Ignoring filesystem types: %ssysfs, rootfs, ramfs, bdev, proc, cpuset, cgroup, cgroup2, tmpfs, devtmpfs, configfs, debugfs, tracefs, securityfs, sockfs, dax, bpf, pipefs, hugetlbfs, devpts, ecryptfs, fuse, fusectl, pstore, mqueue, autofs, overlay, aufs
2019-08-07T12:07:06.812Z        INFO    cfgfile/reload.go:227   Loading of config files completed.
2019-08-07T12:07:06.821Z        INFO    kubernetes/watcher.go:198       kubernetes: Resource sync done
2019-08-07T12:07:06.821Z        INFO    kubernetes/watcher.go:242       kubernetes: Watching API for resource events
2019-08-07T12:07:06.825Z        INFO    kubernetes/watcher.go:198       kubernetes: Resource sync done
2019-08-07T12:07:06.825Z        INFO    kubernetes/watcher.go:242       kubernetes: Watching API for resource events
2019-08-07T12:07:06.835Z        INFO    kubernetes/watcher.go:198       kubernetes: Resource sync done
2019-08-07T12:07:06.836Z        INFO    kubernetes/watcher.go:242       kubernetes: Watching API for resource events
2019-08-07T12:07:07.812Z        INFO    pipeline/output.go:95   Connecting to backoff(elasticsearch(https://elasticsearch-es-http:9200))
2019-08-07T12:07:07.915Z        INFO    elasticsearch/client.go:735     Attempting to connect to Elasticsearch version 7.2.0
2019-08-07T12:07:07.963Z        INFO    template/load.go:88     Template metricbeat already exists and will not be overwritten.
2019-08-07T12:07:07.964Z        INFO    [index-management]      idxmgmt/std.go:289      Loaded index template.
2019-08-07T12:07:07.965Z        INFO    pipeline/output.go:105  Connection to backoff(elasticsearch(https://elasticsearch-es-http:9200)) established
2019-08-07T12:07:22.122Z        INFO    [monitoring]    log/log.go:145  Non-zero metrics in the last 30s        {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":180,"time":{"ms":119}},"total":{"ticks":990,"time":{"ms":354},"value":990},"user":{"ticks":810,"time":{"ms":235}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":9},"info":{"ephemeral_id":"f80769e3-5ba3-4be6-aa6b-3dea3be621a2","uptime":{"ms":60079}},"memstats":{"gc_next":14163488,"memory_alloc":11078728,"memory_total":119480656,"rss":4149248},"runtime":{"goroutines":49}},"libbeat":{"config":{"module":{"running":0},"reloads":1},"output":{"events":{"acked":52,"batches":2,"total":52},"read":{"bytes":13587},"write":{"bytes":75794}},"pipeline":{"clients":3,"events":{"active":0,"filtered":1,"published":52,"retry":29,"total":53},"queue":{"acked":52}}},"metricbeat":{"kubernetes":{"container":{"events":2,"failures":2},"node":{"events":2,"failures":2},"pod":{"events":2,"failures":2},"system":{"events":2,"failures":2},"volume":{"events":2,"failures":2}},"system":{"cpu":{"events":2,"success":2},"filesystem":{"events":1,"success":1},"fsstat":{"events":1,"success":1},"load":{"events":2,"success":2},"memory":{"events":2,"success":2},"network":{"events":18,"success":18},"process":{"events":15,"success":15},"process_summary":{"events":2,"success":2}}},"system":{"load":{"1":0.39,"15":0.59,"5":0.48,"norm":{"1":0.0975,"15":0.1475,"5":0.12}}}}}}

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.