Metricbeat on kubernetes, conflicting configurations

I've been looking at how to deploy Metricbeat onto our Kubernetes cluster and have found conflicting information from two sources.

The first references the material at https://raw.githubusercontent.com/elastic/beats/7.15/deploy/kubernetes/metricbeat-kubernetes.yaml

Which deploys a Daemonset with some auto discovery and some minimal metricsets.

When we then look at the Helm chart (metricbeat in the elastic/helm-charts repository on GitHub, master branch):

There is no autodiscovery, just static module configuration in the DaemonSet, with a separate Deployment to collect the Kubernetes state metrics.
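
For context, the chart's split looks roughly like the sketch below. It is abbreviated and based on my understanding of the 7.x chart's values layout, so treat the key names (`daemonset.metricbeatConfig`, `deployment.metricbeatConfig`) and the `KUBE_STATE_METRICS_HOSTS` variable as assumptions to verify against the chart version in use:

    # Sketch only: abbreviated, key names assumed from the 7.x chart layout
    daemonset:
      metricbeatConfig:
        metricbeat.yml: |
          # runs on every node, scrapes the local kubelet
          metricbeat.modules:
            - module: kubernetes
              metricsets: [container, node, pod, system, volume]
              period: 10s
              host: "${NODE_NAME}"
              hosts: ["https://${NODE_NAME}:10250"]
    deployment:
      metricbeatConfig:
        metricbeat.yml: |
          # single instance, scrapes kube-state-metrics
          metricbeat.modules:
            - module: kubernetes
              metricsets: [state_node, state_deployment, state_pod]
              period: 10s
              hosts: ["${KUBE_STATE_METRICS_HOSTS}"]  # assumed variable name

The practical difference from the Beats manifest is that the cluster-wide collection lives in its own Deployment rather than being handed to an elected leader inside the DaemonSet.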

What we're after is a simple enough configuration which does:

  • Collect the metrics of the kubernetes worker nodes
  • Collect the metrics of the kubernetes cluster overall
  • Collect the metrics from all workloads that expose prometheus metrics via standard annotations.

I'm not worried about which approach, just what the recommended and workable one is using metricbeat 7.15.

Found a config that seems to work reasonably well on EKS 1.20+. I have not got the Prometheus scraping working yet; the logs show a lot of errors, connection refused and the like, which will need further digging.

Raised two bugs on the helm chart repository regarding issues encountered and am using the helm chart with the following config excerpt:

          # enable the http endpoint for health checks
          http:
            enabled: true
            host: localhost
            port: 5066

          # autodiscover kubernetes nodes and collect the metrics from the workers
          metricbeat.autodiscover:
            providers:
              # this uses leader election to have one run as the master and have
              # the leader scrape the kube-state-metrics endpoint and the
              # kubernetes API endpoint
              - type: kubernetes
                scope: cluster
                node: ${NODE_NAME}
                unique: true
                identifier: leader-election-metricbeat
                templates:
                  - config:
                      # kubernetes state metrics
                      - module: kubernetes
                        hosts: ["kube-state-metrics.metrics:8080"]
                        period: 10s
                        add_metadata: true
                        metricsets:
                          - state_node
                          - state_deployment
                          - state_daemonset
                          - state_replicaset
                          - state_pod
                          - state_container
                          - state_job
                          - state_cronjob
                          - state_resourcequota
                          - state_statefulset
                          - state_service
                      # API server metrics
                      - module: kubernetes
                        metricsets:
                          - apiserver
                        hosts: ["https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}"]
                        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                        ssl.certificate_authorities:
                          - /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                        period: 30s
                      # kubernetes events
                      - module: kubernetes
                        metricsets:
                          - event

              # hints based autodiscovery allows for using annotations as per the
              # below reference material to cleanly identify what and how to
              # monitor something
              # see: https://www.elastic.co/guide/en/beats/metricbeat/current/configuration-autodiscover-hints.html
              - type: kubernetes
                node: ${NODE_NAME}
                hints.enabled: true

              # # prometheus resource auto discovery
              # - type: kubernetes
              #   include_annotations: ["prometheus.io.scrape"]
              #   templates:
              #     - condition:
              #         contains:
              #           kubernetes.annotations.prometheus.io/scrape: "true"
              #       config:
              #         - module: prometheus
              #           metricsets: ["collector"]
              #           hosts: "${data.host}:${data.port}"

          metricbeat.modules:
            # collect metrics about the worker node we're deployed on in the daemonset
            - module: kubernetes
              metricsets:
                - container
                - node
                - pod
                - system
                - volume
              period: 10s
              host: "${NODE_NAME}"
              hosts: ["https://${NODE_NAME}:10250"]
              bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
              ssl.verification_mode: "none"
              processors:
                - add_kubernetes_metadata: ~

            # # collect metrics from the kube-proxy on this worker node
            # - module: kubernetes
            #   metricsets:
            #     - proxy
            #   period: 10s
            #   host: ${NODE_NAME}
            #   hosts: ["localhost:10249"]
            #   processors:
            #     - add_kubernetes_metadata: ~

            # collect system metrics from the worker node itself
            - module: system
              period: 10s
              metricsets:
                - cpu
                - load
                - memory
                - network
                - process
                - process_summary
              processes: ['.*']
              process.include_top_n:
                by_cpu: 10
                by_memory: 10

            # collect filesystem metrics from the worker node
            - module: system
              period: 1m
              metricsets:
                - filesystem
                - fsstat
              processors:
              - drop_event.when.regexp:
                  system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib)($|/)'
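
For reference, the hints provider enabled above picks up its settings from pod annotations with the `co.elastic.metrics` prefix. A minimal sketch of what a workload would carry, assuming a hypothetical pod named `example-app` that serves Prometheus metrics on port 9090 at `/metrics` (the name, image and port are made up for illustration):

    apiVersion: v1
    kind: Pod
    metadata:
      name: example-app                       # hypothetical workload
      annotations:
        co.elastic.metrics/module: prometheus
        co.elastic.metrics/metricsets: collector
        co.elastic.metrics/hosts: '${data.host}:9090'
        co.elastic.metrics/metrics_path: /metrics
        co.elastic.metrics/period: 10s
    spec:
      containers:
        - name: example-app
          image: example/app:latest           # hypothetical image
          ports:
            - containerPort: 9090

Note that these are Elastic-specific annotations, not the `prometheus.io/*` ones, so existing workloads would need to be annotated a second time for this route to pick them up.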

So far the hints-based and Prometheus autodiscovery don't work, or are incomplete. The configuration settings for the hints-based approach seem to only be configurable via the `co.elastic.