Hello,
We noticed this issue quite a while ago (high CPU usage on some Filebeat instances), but we finally had time to dig into it further, this time on a more recent version of Filebeat as well (8.10.1).
From what we recently noticed, our logs keep showing that Filebeat gets stop/start events for all pods every 10s. From what I see in the code, this forces a reload of the inputs, and while the harvesters aren't restarted if there is no change, there is still quite a lot of processing.
From my checks, it seems that this is the code that keeps being called: https://github.com/elastic/beats/blob/v8.10.1/libbeat/autodiscover/providers/kubernetes/pod.go#L185. However, 99% of our pods aren't updated every 10 seconds.
So I dug around a bit more and noticed this condition:
https://github.com/elastic/beats/blob/v8.10.1/libbeat/autodiscover/providers/kubernetes/pod.go#L155. From what I understand, if we have a node watcher (our scope is "node", so we do have one) and either hints or metaconf.node is enabled, this adds a watcher that triggers a pod update for every node update.
I ran kubectl get nodes --watch and noticed that the nodes are indeed updated every 10s, which seems to be the source of these repeated updates. This comes from the kubelet and its nodeStatusUpdateFrequency setting, which defaults to 10s (see Kubelet Configuration (v1beta1) in the Kubernetes documentation).
From what I understand, the node watcher is mostly there to pick up the labels/annotations of the nodes so they can be added to the logs. However, these periodic updates mostly change the status field of the node, not its labels or annotations.
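To illustrate what I mean by a status-only update, here is a rough Python sketch. It is not how Filebeat behaves today, and the function name and the sample node objects are hypothetical. It only shows that two node objects can differ in their status while the labels/annotations (the only part I would expect to matter for log metadata) stay identical.

```python
def node_metadata_changed(old_node: dict, new_node: dict) -> bool:
    """Return True only if labels or annotations differ between two node objects.

    Hypothetical helper: everything outside metadata.labels and
    metadata.annotations (for example the status field) is ignored.
    """
    def relevant(node: dict):
        meta = node.get("metadata") or {}
        return meta.get("labels") or {}, meta.get("annotations") or {}

    return relevant(old_node) != relevant(new_node)


# Made-up node objects, loosely shaped like `kubectl get node -o json` output.
before = {
    "metadata": {"labels": {"zone": "a"}, "annotations": {}},
    "status": {"conditions": [{"type": "Ready", "lastHeartbeatTime": "08:27:13"}]},
}
status_only = {
    "metadata": {"labels": {"zone": "a"}, "annotations": {}},
    "status": {"conditions": [{"type": "Ready", "lastHeartbeatTime": "08:27:23"}]},
}
relabeled = {
    "metadata": {"labels": {"zone": "b"}, "annotations": {}},
    "status": {"conditions": [{"type": "Ready", "lastHeartbeatTime": "08:27:23"}]},
}

print(node_metadata_changed(before, status_only))  # False: only the status moved
print(node_metadata_changed(before, relabeled))    # True: a label changed
```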
The condition seems different in the elastic-agent-autodiscover lib, where only the node metadata config value and the watcher are checked, not the presence of hints: https://github.com/elastic/elastic-agent-autodiscover/blob/main/kubernetes/metadata/metadata.go#L101
This makes it impossible to disable the node metadata watcher if we want to use hints.
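To make the difference between the two conditions concrete, here is a small Python paraphrase of how I read them. This is not the actual Go code; the function and parameter names are made up for illustration, and the logic is only my understanding of the two linked lines.

```python
def beats_registers_node_updater(has_node_watcher: bool,
                                 hints_enabled: bool,
                                 node_metadata_enabled: bool) -> bool:
    # My reading of the pod.go condition: a node watcher exists AND
    # (hints are enabled OR node metadata is enabled).
    return has_node_watcher and (hints_enabled or node_metadata_enabled)


def metadata_lib_uses_node(has_node_watcher: bool,
                           node_metadata_enabled: bool) -> bool:
    # My reading of the metadata.go condition: a node watcher exists AND
    # node metadata is enabled. Hints are not consulted.
    return has_node_watcher and node_metadata_enabled


# Our setup: scope "node" (so a node watcher exists), hints on, node metadata off.
print(beats_registers_node_updater(True, True, False))  # True
print(metadata_lib_uses_node(True, False))              # False
```

So with hints enabled, the first condition stays true regardless of the node metadata setting, which matches what we observe.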
Am I understanding all this correctly, and is it on purpose that Filebeat reacts to all these updates? We currently have 4 GB of logs from Filebeat alone, consisting of lines like this one written every 10s:
{"log.level":"info","@timestamp":"2023-11-03T08:27:23.011Z","log.logger":"input","log.origin":{"file.name":"log/input.go","file.line":174},"message":"Configured paths: [`
As we do not use the node metadata, is there a way to fully disable it without disabling hints? Or do we need to work around this by moving the few templates we define through hints into the main configuration and loading them differently?
Kr,