I'm getting these errors when trying to install Metricbeat as a DaemonSet using Terraform.
What could be the cause: Metricbeat or AKS?
2022-05-03T20:09:01.492Z ERROR module/wrapper.go:259 Error fetching data for metricset kubernetes.system: error doing HTTP request to fetch 'system' Metricset data: error making http request: Get "{my_aks_url}:10250/stats/summary": dial tcp {my_aks_IP}:10250: i/o timeout
2022-05-03T20:09:01.492Z ERROR module/wrapper.go:259 Error fetching data for metricset kubernetes.volume: error doing HTTP request to fetch 'volume' Metricset data: error making http request: Get "{my_aks_url}:10250/stats/summary": dial tcp {my_aks_IP}:10250: i/o timeout
2022-05-03T20:09:01.612Z ERROR [kubernetes.container] container/container.go:93 error making http request: Get "{my_aks_url}:10250/stats/summary": dial tcp {my_aks_IP}:10250: i/o timeout
Hi @surprised_ferret
Are you using the default configuration for this field, hosts: ["https://${NODE_NAME}:10250"], as in beats/metricbeat-kubernetes.yaml at main · elastic/beats · GitHub?
You can exec into one of the Metricbeat pods and try to access the kubelet directly:
kubectl exec -it <metricbeat-pod> -- bash
token=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl -k -H "Authorization: Bearer $token" "https://${NODE_NAME}:10250/stats/summary"
Does it fail with the same error?
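If it helps, the same check can be run without an interactive shell. This is only a minimal sketch: the function name check_kubelet_from_pod and its two arguments (namespace and pod name) are made up for illustration, and it assumes curl is available in the Metricbeat image and that NODE_NAME is set inside the pod.

```shell
#!/bin/sh
# Hypothetical helper: run the kubelet check inside a pod in one shot.
# $1 = namespace, $2 = pod name (both are values you supply).
check_kubelet_from_pod() {
  kubectl exec -n "$1" "$2" -- sh -c '
    # Everything in this quoted script runs inside the pod.
    token=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
    # Print only the HTTP status code; give up after 5s if nothing answers.
    curl -sk --connect-timeout 5 -o /dev/null -w "%{http_code}\n" \
      -H "Authorization: Bearer $token" \
      "https://${NODE_NAME}:10250/stats/summary"
  '
}
```

Call it as check_kubelet_from_pod <namespace> <metricbeat-pod>. The --connect-timeout flag just makes a blocked port fail quickly instead of hanging.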
I am passing this YAML to Terraform:
daemonset:
  metricbeatConfig:
    metricbeat.yml: |
      metricbeat.modules:
      - module: kubernetes
        metricsets:
          - container
          - node
          - pod
          - system
          - volume
        period: 10s
        host: "${NODE_NAME}" # passed in as azurerm_kubernetes_cluster.global.fqdn
        hosts: ["${NODE_NAME}:10250"]
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        ssl.verification_mode: "none"
        processors:
        - add_kubernetes_metadata: ~
      - module: kubernetes
        enabled: true
        metricsets:
          - event
      - module: system
        period: 10s
        metricsets:
          - cpu
          - load
          - memory
          - network
          - process
          - process_summary
        processes: ['.*']
        process.include_top_n:
          by_cpu: 5
          by_memory: 5
      - module: system
        period: 1m
        metricsets:
          - filesystem
          - fsstat
        processors:
        - drop_event.when.regexp:
            system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib)($|/)'
      output.elasticsearch:
        protocol: https
        hosts: '${ELASTICSEARCH_HOSTS}' # passed in as ec_deployment.global.elasticsearch.0.https_endpoint
        username: '${ELASTICSEARCH_USERNAME}' # passed in as elasticstack_elasticsearch_security_user.metricbeat_user.username
        password: '${ELASTICSEARCH_PASSWORD}' # passed in as elasticstack_elasticsearch_security_user.metricbeat_user.password
      logging.metrics.enabled: false
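One thing that might be worth checking, given the comment on host above: the kubelet listens on port 10250 on each node, so the value behind NODE_NAME would normally be a node's name rather than the cluster FQDN. Whether that explains the timeouts here is an assumption, not something confirmed in this thread. A small sketch to compare the value against the cluster's nodes (is_node_name is a made-up helper name):

```shell
#!/bin/sh
# Hypothetical helper: succeeds if $1 is the name of a node in the
# cluster that kubectl currently points at.
is_node_name() {
  # "kubectl get nodes -o name" prints one "node/<name>" per line;
  # grep -Fqx matches the whole line literally and prints nothing.
  kubectl get nodes -o name | grep -Fqx "node/$1"
}
```

For example, is_node_name "<value passed as NODE_NAME>" && echo "is a node" || echo "not a node".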
I tried to log into a pod within the metricbeat daemonset, but it says 'not found'.
I also tried to log into a pod within the metricbeat-metricbeat-metrics deployment, but same issue.
Actually, hold on: I needed to pass in the namespace. One moment, please...
I'm getting a timeout here as well, @Tetiana_Kravchenko:
root@metricbeat-metricbeat-78pt9:/usr/share/metricbeat# curl -k -H "Authorization: Bearer $(echo $token)" "https://{my_node_name}.hcp.centralus.azmk8s.io:10250/stats/summary"
curl: (28) Failed to connect to {my_node_name}.hcp.centralus.azmk8s.io port 10250: Connection timed out
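For reference, the number in parentheses is curl's exit code, and it gives a rough hint about where the request stopped. A minimal sketch that maps the most common codes to a short description; the function name explain_curl_exit and the wording of the messages are my own, only the code numbers come from curl:

```shell
#!/bin/sh
# Hypothetical helper: turn a curl exit code into a one-line hint.
explain_curl_exit() {
  case "$1" in
    0)  echo "ok: the request completed" ;;
    6)  echo "dns: could not resolve the host name" ;;
    7)  echo "connect: failed to connect to the host" ;;
    28) echo "timeout: no answer before the timeout expired" ;;
    *)  echo "other: see EXIT CODES in the curl man page" ;;
  esac
}
```

So the (28) above means the connection attempt to port 10250 never got an answer, which matches the i/o timeout in the Metricbeat log.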
I think it might be related to AKS internals; a similar issue is mentioned here in the AKS documentation.