I have been trying to deal with the horrible event management of metricbeat. I find it a terrible design that the main configuration is logged, but when it comes to modules you need to do a special run with debug enabled just to see publish events.
I have been trying to connect to Kubernetes on 10250, but the Elastic documentation focuses on 10255, and when it refers to a bearer token the example given is fictional: there is no /usr/share/secret in Kubernetes. If this is only an example, could you give additional commands to rule out an RBAC permissions issue? Either way, the logs never tell me anything. How can a system application not produce logs when a warning or a fatal connection error happens? Sounds like bad exception handling?
The lack of information is crippling development, and I dread the day it comes to upgrading... self-made design complexity?
I STRONGLY encourage Elastic to rethink how modules connect and, if they fail to do so, to put that into the metricbeat log. It seems obvious to me that if there is a configuration error in one of the modules it should tell me, regardless of whether I have debug on.
My problems are:
I am using the kube.config file from the example for "outside cluster config" and I am unable to tell whether it is connecting or not, whether it lacks permissions or not, or really anything at all.
How do I get my metricbeat to collect data from the kubelet?
I would say that the configuration in the original post is not being loaded.
I can see the apiserver metricset, but no trace of the other modules.
The apiserver metricset seems to be failing because of a certificate issue when connecting to the endpoint (by the way, it puzzles me a bit that it is listening on 30555; I'll assume there is some proxy in front of it. Can you curl it and make sure that the apiserver is responding?)
2019-06-24T17:41:21.983Z DEBUG [reload] cfgfile/list.go:101 Starting runner: kubernetes [metricsets=1]
...
2019-06-24T17:41:21.993Z DEBUG [module] module/wrapper.go:179 Starting metricSetWrapper[module=kubernetes, name=apiserver, host=master.kubernetes.sensored.com:30555]
...
"message": "error making http request: Get https://master.kubernetes.sensored.com:30555/metrics: x509: certificate is valid for instance, not master.kubernetes.sensored.com"
The metrics you mention in your post are gathered from the kubelet. You should deploy a daemonset that targets all nodes, or, if you prefer, you can test manually on one of the nodes. If you are already doing so but the expected metrics are not showing, can you also share the configuration files? It sounds like some files are not being "included".
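For the x509 error in the log above, here is a minimal sketch of how the apiserver metricset could be pointed at a name the certificate is actually valid for. The hostname, port and CA path are placeholders I made up, not values from your setup; `ssl.certificate_authorities` and `ssl.verification_mode` are the standard Beats SSL settings.

```yaml
- module: kubernetes
  enabled: true
  metricsets:
    - apiserver
  # Placeholder host: use a name that is listed in the server certificate.
  hosts: ["https://name-in-certificate.example:6443"]
  # Placeholder path: the CA that signed the server certificate.
  ssl.certificate_authorities: ["/path/to/ca.crt"]
  # For a quick test only: this turns certificate verification off entirely.
  #ssl.verification_mode: "none"
```

If the request works with verification off and fails with it on, the problem is the certificate or hostname rather than the module configuration.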
Thank you for reviewing my logs. For clarification, 30555 is Elasticsearch exposed through a node port in Kubernetes.
I understand what you mean in relation to the API server, and I will resolve that even though it is out of scope for this issue. The apiserver call is pulling pod states; the kubelet is pulling system metrics. What I find interesting in what you're saying is...
I have apiserver, the state_* metricsets and kubelet all in the same kubernetes.yml file as provided by metricbeat. I'm confused as to how my kubelet config above wouldn't work but the apiserver call does. Is it giving up and not trying the other modules in the same yml file? If it's hitting an error while loading, why don't I get an error about this in the metricbeat logs? The lack of information when things don't go right is frustrating, and it makes me dread the hell that awaits me when I upgrade. I find this an oversight from Elastic that causes unnecessary complexity. Maybe it's just me not being used to debug mode.
I will reply in more detail tomorrow morning, but again, I have followed the Elastic guidelines closely: I took kubernetes.yml.disabled and made it active.
I have resolved all my issues. I decided to target the kubelet on 10255 and I was able to get it to gather information. Thank you for showing me where in the logs to look when debugging authentication against 10250, since leaving the read-only port open without authentication would be considered a CVE-level issue.
@pmercado I really hope you can take my feedback on board: when a certificate problem, an authentication problem or any other issue stops a module from working, it should show up in /var/log/metricbeat/metricbeat. I have hit the same thing consistently across all Elastic applications: the logs are insufficient and fall short of industry standards.
Again, thank you for your help. I was feeling helpless and was reading the logs from the wrong direction.
My final working config is as follows, for anyone else having this issue:
# Node metrics, from kubelet:
- module: kubernetes
  metricsets:
    - container
    - node
    - pod
    - system
    - volume
  period: 10s
  hosts: ["localhost:10255"]
  enabled: true
  in_cluster: false
  kube_config: /root/.kube/config

# State metrics from kube-state-metrics service:
- module: kubernetes
  kube_config: /root/.kube/config
  enabled: true
  metricsets:
    - state_node
    - state_deployment
    - state_replicaset
    - state_statefulset
    - state_pod
    - state_container
  period: 10s
  # kube_state_metrics is on port 30500
  hosts: ["master0.kubernetes:30500"]
  in_cluster: false
  # Enriching parameters:
  add_metadata: true
  # When used outside the cluster:

# Kubernetes events
- module: kubernetes
  kube_config: /root/.kube/config
  in_cluster: false
  enabled: true
  metricsets:
    - event

# Kubernetes API server
- module: kubernetes
  kube_config: /root/.kube/config
  in_cluster: false
  enabled: true
  metricsets:
    - apiserver
  hosts: ["http://localhost:8080"]
As you can see, just to prove this works I am forced to disable SSL. That is fine here, as it is called locally, but we should all be aiming for the connection security that https://www.abetterinternet.org/ (ISRG) promotes, and Elastic does not help users achieve this: the Elastic docs mostly direct users to set up integrations without TLS.
I hope you will reconsider your documentation to provide best practices.
You should be able to generate new certificates for a kubelet client, although this is unlikely to be the path to take.
Most probably you will want to deploy metricbeat as a daemonset and have the kubelet configured with webhook authorization mode. If that's the case, you will just need to associate the metricbeat pod with a service account that is allowed to GET the metrics endpoint.
You can create the Role + RoleBinding + SA and try it with curl before proceeding with the metricbeat daemonset (assuming that will be easier to test).
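As a rough sketch of what the kubelet module could look like in that daemonset setup: this assumes the pod has a `NODE_NAME` environment variable (for example, set from the downward API) and uses the service account token that Kubernetes mounts into pods by default. Adjust names and paths to your own deployment.

```yaml
# Node metrics, from the kubelet's authenticated port:
- module: kubernetes
  metricsets:
    - container
    - node
    - pod
    - system
    - volume
  period: 10s
  # NODE_NAME is an assumption: an env var the daemonset must define.
  hosts: ["https://${NODE_NAME}:10250"]
  # Default mount path of the pod's service account token.
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  # Skips verification of the kubelet certificate; prefer
  # ssl.certificate_authorities once you have the right CA.
  ssl.verification_mode: "none"
```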
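A minimal sketch of those objects, with made-up names (`metricbeat`, `metricbeat-kubelet`) and the `kube-system` namespace as assumptions. I have written it as a ClusterRole and ClusterRoleBinding rather than a namespaced Role, because nodes are cluster-scoped resources; with webhook authorization the kubelet maps its endpoints to the `nodes/metrics`, `nodes/stats` and `nodes/proxy` subresources.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metricbeat            # assumed name
  namespace: kube-system      # assumed namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metricbeat-kubelet    # assumed name
rules:
  - apiGroups: [""]
    # Subresources the kubelet checks for /metrics, /stats/* and other paths.
    resources: ["nodes/metrics", "nodes/stats", "nodes/proxy"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metricbeat-kubelet
subjects:
  - kind: ServiceAccount
    name: metricbeat
    namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: metricbeat-kubelet
```

Once this is applied, a curl against the kubelet on 10250 using the service account's token should tell you whether the problem is RBAC or something else, before metricbeat is involved at all.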