Found a config that seems to work OK-ish on EKS 1.20+. I have not got the Prometheus scraping working yet; the logs show a lot of errors (connection refused and the like) that will need further digging.
I raised two bugs on the helm chart repository for the issues I ran into, and I am using the helm chart with the following config excerpt:
# enable the http endpoint for health checks
http:
  enabled: true
  host: localhost
  port: 5066
# autodiscover kubernetes nodes and collect the metrics from the workers
metricbeat.autodiscover:
  providers:
    # this uses leader election to have one run as the master and have
    # the leader scrape the kube-state-metrics endpoint and the
    # kubernetes API endpoint
    - type: kubernetes
      scope: cluster
      node: ${NODE_NAME}
      unique: true
      identifier: leader-election-metricbeat
      templates:
        - config:
            # kubernetes state metrics
            - module: kubernetes
              hosts: ["kube-state-metrics.metrics:8080"]
              period: 10s
              add_metadata: true
              metricsets:
                - state_node
                - state_deployment
                - state_daemonset
                - state_replicaset
                - state_pod
                - state_container
                - state_job
                - state_cronjob
                - state_resourcequota
                - state_statefulset
                - state_service
            # API server metrics
            - module: kubernetes
              metricsets:
                - apiserver
              hosts: ["https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}"]
              bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
              ssl.certificate_authorities:
                - /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
              period: 30s
            # kubernetes events
            - module: kubernetes
              metricsets:
                - event
    # hints based autodiscovery allows for using annotations as per the
    # below reference material to cleanly identify what and how to
    # monitor something
    # see: https://www.elastic.co/guide/en/beats/metricbeat/current/configuration-autodiscover-hints.html
    - type: kubernetes
      node: ${NODE_NAME}
      hints.enabled: true
    # # prometheus resource auto discovery
    # - type: kubernetes
    #   include_annotations: ["prometheus.io.scrape"]
    #   templates:
    #     - condition:
    #         contains:
    #           kubernetes.annotations.prometheus.io/scrape: "true"
    #       config:
    #         - module: prometheus
    #           metricsets: ["collector"]
    #           hosts: "${data.host}:${data.port}"
metricbeat.modules:
  # collect metrics about the worker node we're deployed on in the daemonset
  - module: kubernetes
    metricsets:
      - container
      - node
      - pod
      - system
      - volume
    period: 10s
    host: "${NODE_NAME}"
    hosts: ["https://${NODE_NAME}:10250"]
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    ssl.verification_mode: "none"
    processors:
      - add_kubernetes_metadata: ~
  # # collect metrics from the kube-proxy on this worker node
  # - module: kubernetes
  #   metricsets:
  #     - proxy
  #   period: 10s
  #   host: ${NODE_NAME}
  #   hosts: ["localhost:10249"]
  #   processors:
  #     - add_kubernetes_metadata: ~
  # collect system metrics from the worker node itself
  - module: system
    period: 10s
    metricsets:
      - cpu
      - load
      - memory
      - network
      - process
      - process_summary
    processes: ['.*']
    process.include_top_n:
      by_cpu: 10
      by_memory: 10
  # collect filesystem metrics from the worker node
  - module: system
    period: 1m
    metricsets:
      - filesystem
      - fsstat
    processors:
      - drop_event.when.regexp:
          system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib)($|/)'
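For context on the hints provider in the config above: per the linked reference, it reads its settings from annotations on the monitored pods rather than from `metricbeat.yml`. A rough sketch of what such a pod might look like follows. The pod name, image and port 9090 are hypothetical and only there for illustration, and I have not confirmed this works on the cluster described here.

```yaml
# Hypothetical pod, for illustration only: name, image and port are made up.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
  annotations:
    # documented Metricbeat hints: which module and metricset to run
    co.elastic.metrics/module: prometheus
    co.elastic.metrics/metricsets: collector
    # ${data.host} is resolved by autodiscover to the pod IP
    co.elastic.metrics/hosts: '${data.host}:9090'
    co.elastic.metrics/metrics_path: /metrics
    co.elastic.metrics/period: 1m
spec:
  containers:
    - name: example-app
      image: example/app:latest
      ports:
        - containerPort: 9090
```

The annotations only take effect for pods on a node where a Metricbeat instance with `hints.enabled: true` is running, which is why that provider is scoped with `node: ${NODE_NAME}`.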
So far the hints-based and Prometheus autodiscovery don't work, or are at least incomplete. The settings for the hints-based approach seem to be configurable only via the `co.elastic.