I've been looking at how to deploy Metricbeat onto our Kubernetes cluster and have found conflicting information from two sources.
The first is the reference manifest at https://raw.githubusercontent.com/elastic/beats/7.15/deploy/kubernetes/metricbeat-kubernetes.yaml
which deploys a DaemonSet with some autodiscovery and a minimal set of metricsets.
The second is the Helm chart (metricbeat in the elastic/helm-charts repository on GitHub),
which has no autodiscovery, just static module configuration in the DaemonSet, plus a Deployment to collect the kubernetes state metrics.
What we're after is a reasonably simple configuration that does the following:
- Collect metrics from the Kubernetes worker nodes
- Collect metrics for the Kubernetes cluster overall
- Collect metrics from all workloads that expose Prometheus metrics via the standard annotations.
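By "standard annotations" in the last point I mean the de facto Prometheus convention (these annotation names are a community convention, not something Kubernetes or Metricbeat defines). A hypothetical annotated workload would look something like:

```yaml
# hypothetical example pod; name, image and port are placeholders
apiVersion: v1
kind: Pod
metadata:
  name: example-app
  annotations:
    prometheus.io/scrape: "true"   # opt this pod in to scraping
    prometheus.io/port: "9090"     # port the metrics endpoint listens on
    prometheus.io/path: "/metrics" # path to the metrics endpoint
spec:
  containers:
    - name: app
      image: example/app:latest
      ports:
        - containerPort: 9090
```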
I'm not tied to either approach; I just want to know which one is recommended and workable with Metricbeat 7.15.
Update: I've found a config that seems to work OK-ish on EKS 1.20+. I haven't got the Prometheus scraping working yet; the logs show a lot of connection-refused and similar errors that will need further digging.
I've raised two bugs on the Helm chart repository for issues I encountered, and I'm using the Helm chart with the following config excerpt:
```yaml
# enable the http endpoint for health checks
http:
  enabled: true
  host: localhost
  port: 5066

# autodiscover kubernetes nodes and collect the metrics from the workers
metricbeat.autodiscover:
  providers:
    # this uses leader election to have one run as the master and have
    # the leader scrape the kube-state-metrics endpoint and the
    # kubernetes API endpoint
    - type: kubernetes
      scope: cluster
      node: ${NODE_NAME}
      unique: true
      identifier: leader-election-metricbeat
      templates:
        - config:
            # kubernetes state metrics
            - module: kubernetes
              hosts: ["kube-state-metrics.metrics:8080"]
              period: 10s
              add_metadata: true
              metricsets:
                - state_node
                - state_deployment
                - state_daemonset
                - state_replicaset
                - state_pod
                - state_container
                - state_job
                - state_cronjob
                - state_resourcequota
                - state_statefulset
                - state_service
            # API server metrics
            - module: kubernetes
              metricsets:
                - apiserver
              hosts: ["https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}"]
              bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
              ssl.certificate_authorities:
                - /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
              period: 30s
            # kubernetes events
            - module: kubernetes
              metricsets:
                - event
    # hints based autodiscovery allows for using annotations as per the
    # below reference material to cleanly identify what and how to
    # monitor something
    # see: https://www.elastic.co/guide/en/beats/metricbeat/current/configuration-autodiscover-hints.html
    - type: kubernetes
      node: ${NODE_NAME}
      hints.enabled: true
    # # prometheus resource auto discovery
    # - type: kubernetes
    #   include_annotations: ["prometheus.io.scrape"]
    #   templates:
    #     - condition:
    #         contains:
    #           kubernetes.annotations.prometheus.io/scrape: "true"
    #       config:
    #         - module: prometheus
    #           metricsets: ["collector"]
    #           hosts: "${data.host}:${data.port}"

metricbeat.modules:
  # collect metrics about the worker node we're deployed on in the daemonset
  - module: kubernetes
    metricsets:
      - container
      - node
      - pod
      - system
      - volume
    period: 10s
    host: "${NODE_NAME}"
    hosts: ["https://${NODE_NAME}:10250"]
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    ssl.verification_mode: "none"
    processors:
      - add_kubernetes_metadata: ~
  # # collect metrics from the kube-proxy on this worker node
  # - module: kubernetes
  #   metricsets:
  #     - proxy
  #   period: 10s
  #   host: ${NODE_NAME}
  #   hosts: ["localhost:10249"]
  #   processors:
  #     - add_kubernetes_metadata: ~
  # collect system metrics from the worker node itself
  - module: system
    period: 10s
    metricsets:
      - cpu
      - load
      - memory
      - network
      - process
      - process_summary
    processes: ['.*']
    process.include_top_n:
      by_cpu: 10
      by_memory: 10
  # collect filesystem metrics from the worker node
  - module: system
    period: 1m
    metricsets:
      - filesystem
      - fsstat
    processors:
      - drop_event.when.regexp:
          system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib)($|/)'
```
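For reference, the hints provider above reads `co.elastic.metrics/*` annotations on the workloads themselves (per the hints-based autodiscovery docs linked in the config). A sketch of what an annotated pod might look like, assuming a Redis workload as the example (pod name and image are placeholders):

```yaml
# hypothetical pod telling Metricbeat how to monitor it via hints
apiVersion: v1
kind: Pod
metadata:
  name: example-redis
  annotations:
    co.elastic.metrics/module: redis          # metricbeat module to use
    co.elastic.metrics/metricsets: info       # metricsets within that module
    co.elastic.metrics/hosts: '${data.host}:6379'  # resolved to the pod IP at runtime
    co.elastic.metrics/period: 10s            # scrape interval
spec:
  containers:
    - name: redis
      image: redis:6
```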
So far the hints-based and Prometheus autodiscovery don't work, or are incomplete. The configuration settings for the hints-based approach seem to only be configurable via the `co.elastic.