Hello, new user here. I recently deployed ECK on my own server with kubeadm, using the ECK operator and the quickstart guide. I'm only testing for now, so I have a single master node that also schedules pods.

I then wanted to monitor my Kubernetes cluster with the quickstart Elasticsearch/Kibana, so I turned to Metricbeat with the kubernetes module. I deployed kube-state-metrics as a pod without problems and then deployed Metricbeat on Kubernetes following https://www.elastic.co/guide/en/beats/metricbeat/current/running-on-kubernetes.html. Both the Deployment and DaemonSet pods come up fine, but the DaemonSet pod logs errors like:
2020-01-22T13:43:50.000Z INFO module/wrapper.go:252 Error fetching data for metricset kubernetes.volume: error doing HTTP request to fetch 'volume' Metricset data: HTTP error 400 in : 400 Bad Request
2020-01-22T13:43:50.000Z INFO module/wrapper.go:252 Error fetching data for metricset kubernetes.system: error doing HTTP request to fetch 'system' Metricset data: HTTP error 400 in : 400 Bad Request
2020-01-22T13:43:50.100Z INFO module/wrapper.go:252 Error fetching data for metricset kubernetes.node: error doing HTTP request to fetch 'node' Metricset data: HTTP error 400 in : 400 Bad Request
2020-01-22T13:43:50.100Z INFO module/wrapper.go:252 Error fetching data for metricset kubernetes.container: error doing HTTP request to fetch 'container' Metricset data: HTTP error 400 in : 400 Bad Request
2020-01-22T13:43:50.100Z INFO module/wrapper.go:252 Error fetching data for metricset kubernetes.pod: error doing HTTP request to fetch 'pod' Metricset data: HTTP error 400 in : 400 Bad Request
2020-01-22T13:43:56.871Z INFO [monitoring] log/log.go:145 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":200,"time":{"ms":102}},"total":{"ticks":1160,"time":{"ms":363},"value":1160},"user":{"ticks":960,"time":{"ms":261}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":12},"info":{"ephemeral_id":"751c1c1e-52f6-4823-b88f-16b1cd5a81a7","uptime":{"ms":90066}},"memstats":{"gc_next":20000144,"memory_alloc":11140640,"memory_total":188857472,"rss":794624},"runtime":{"goroutines":92}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":90,"batches":3,"total":90},"read":{"bytes":21162},"write":{"bytes":120577}},"pipeline":{"clients":4,"events":{"active":0,"published":90,"total":90},"queue":{"acked":90}}},"metricbeat":{"kubernetes":{"container":{"events":3,"failures":3},"node":{"events":3,"failures":3},"pod":{"events":3,"failures":3},"proxy":{"events":9,"success":9},"system":{"events":3,"failures":3},"volume":{"events":3,"failures":3}},"system":{"cpu":{"events":3,"success":3},"load":{"events":3,"success":3},"memory":{"events":3,"success":3},"network":{"events":36,"success":36},"process":{"events":18,"success":18},"process_summary":{"events":3,"success":3}}},"system":{"load":{"1":0.2,"15":0.31,"5":0.25,"norm":{"1":0.0333,"15":0.0517,"5":0.0417}}}}}}
It looks like the kubernetes module's requests for the node, system, pod, container and volume metricsets get HTTP 400 responses when querying the kubelet (the proxy metricset succeeds), but I haven't been able to find out why. My Metricbeat DaemonSet config contains the following:
kubernetes.yml: |-
  - module: kubernetes
    metricsets:
      - node
      - system
      - pod
      - container
      - volume
    period: 10s
    host: ${NODE_NAME}
    hosts: ["localhost:10250"]
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    ssl:
      # disable verification since kubelet uses self signed certificate
      verification_mode: none
      certificate_authorities:
        - /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  - module: kubernetes
    metricsets:
      - proxy
    period: 10s
    host: ${NODE_NAME}
    hosts: ["localhost:10249"]
I'm using Metricbeat 7.5.1 (Image: docker.elastic.co/beats/metricbeat:7.5.1), Kubernetes & Kubelet v1.17.1.
Things I tried that didn't work:
- Giving the Metricbeat pod access to the API server kubelet client certificates and key to authenticate (instead of service account token)
- Using $NODE_NAME instead of localhost
- Passing the debug parameter (-d "*") to metricbeat (I hoped this would let me see the requests that were failing with status 400, but it didn't)
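Since the debug output does not show the failing requests, one option is to reproduce the request outside Metricbeat and read the raw response body. Below is a minimal sketch, not a confirmed diagnosis. It assumes the failing metricsets query the kubelet's /stats/summary endpoint, and it reuses the host and token path from the config above. The names build_request, probe and main are hypothetical helpers made up for this illustration.

```python
import http.client
import ssl
import urllib.error
import urllib.request

# Same service account token path as in the Metricbeat config above.
TOKEN_FILE = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def build_request(host, token, scheme="https"):
    """Build a GET request for the kubelet stats endpoint (assumed: /stats/summary)."""
    url = f"{scheme}://{host}/stats/summary"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def probe(host, token, scheme="https"):
    """Send the request and return (status code, first 200 bytes of the body)."""
    # Mirror 'verification_mode: none' from the config: skip certificate checks.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    req = build_request(host, token, scheme)
    try:
        with urllib.request.urlopen(req, context=ctx, timeout=5) as resp:
            return resp.status, resp.read(200).decode(errors="replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a body that may explain the failure.
        return err.code, err.read(200).decode(errors="replace")


def main(host="localhost:10250"):
    """Try both schemes so the two responses can be compared side by side."""
    with open(TOKEN_FILE) as fh:
        token = fh.read().strip()
    for scheme in ("http", "https"):
        try:
            print(scheme, probe(host, token, scheme))
        except (OSError, http.client.HTTPException) as err:
            print(scheme, "request failed:", err)
```

Calling main() from a pod or host that has Python and access to the token prints the status and the start of the response body for each scheme, which should carry more detail than the bare "400 Bad Request" in the Metricbeat log.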
Why am I seeing these errors? Am I missing something?
Thanks in advance.