### Setup
Observed in Filebeat 8.5.3 running in Azure Kubernetes `v1.24.6` with the following autodiscover configuration:
<details>
```
filebeat:
  autodiscover:
    providers:
    - templates:
      - condition:
          or:
          - equals:
              kubernetes:
                container:
                  name: elasticsearch
          - equals:
              kubernetes:
                container:
                  name: kibana
        config:
        - paths:
          - /var/log/pods/${data.kubernetes.namespace}_${data.kubernetes.pod.name}_${data.kubernetes.pod.uid}/${data.kubernetes.container.name}/*.log
          processors:
          - decode_json_fields:
              add_error_key: true
              fields:
              - message
              max_depth: 1
              overwrite_keys: true
              process_array: false
              target: ""
          type: container
      type: kubernetes
```
</details>
### Context
While addressing https://github.com/elastic/beats/issues/23139, Node and Namespace watchers were added to Autodiscover to make Node/Namespace metadata changes instantly available to the related Pods. Currently, when a Node modification is received via a watcher, all Pods located on that Node are retrieved from the watcher store ([src](https://github.com/elastic/elastic-agent-autodiscover/blob/v0.3.0/kubernetes/eventhandler.go#L202)) and stop and start Pod events are emitted ([src](https://github.com/elastic/beats/blob/v8.5.3/libbeat/autodiscover/providers/kubernetes/pod.go#L158)). A single Pod can emit multiple start and stop events ([src](https://github.com/elastic/beats/blob/v8.5.3/libbeat/autodiscover/providers/kubernetes/pod.go#L278)). This happens regardless of the nature of the change to the Node object.
In the environment described in the `Setup` section, Node objects were observed to be updated as frequently as **every 10 seconds**. Filebeat logs do not show exactly what was received via the watch APIs, but this can be verified as follows:
```
kubectl proxy &
stdbuf -oL curl -s '127.0.0.1:8001/api/v1/namespaces?allowWatchBookmarks=true&watch=true&pretty=false' > namespaces.log &
stdbuf -oL curl -s '127.0.0.1:8001/api/v1/nodes?allowWatchBookmarks=true&watch=true&pretty=false' > nodes.log &
stdbuf -oL curl -s '127.0.0.1:8001/api/v1/pods?allowWatchBookmarks=true&watch=true&pretty=false' > pods.log &
```
Node objects were modified in the `.status.conditions[]` array, which is related to the Node monitoring process. The only field modified was `lastHeartbeatTime`. Here's an example of a message received via the watch API:
<details>
```
{
"type": "MODIFIED",
"object": {
"kind": "Node",
"apiVersion": "v1",
"metadata": {
"name": "<REDACTED>", ...
},
"spec": { ...
},
"status": {
"capacity": { ...
},
"allocatable": { ...
},
"conditions": [
{
"type": "FilesystemCorruptionProblem",
"status": "False",
"lastHeartbeatTime": "2023-02-27T08:10:59Z", <--- HERE
"lastTransitionTime": "2023-02-09T07:41:54Z",
"reason": "FilesystemIsOK",
"message": "Filesystem is healthy"
},
{
"type": "ContainerRuntimeProblem",
"status": "False",
"lastHeartbeatTime": "2023-02-27T08:10:59Z", <--- HERE
"lastTransitionTime": "2023-02-10T20:37:25Z",
"reason": "ContainerRuntimeIsUp",
"message": "container runtime service is up"
},
{
"type": "KubeletProblem",
"status": "False",
"lastHeartbeatTime": "2023-02-27T08:10:59Z", <--- HERE
"lastTransitionTime": "2023-02-10T20:37:25Z",
"reason": "KubeletIsUp",
"message": "kubelet service is up"
},
{
"type": "FrequentDockerRestart",
"status": "False",
"lastHeartbeatTime": "2023-02-27T08:10:59Z", <--- HERE
"lastTransitionTime": "2023-02-09T07:41:54Z",
"reason": "NoFrequentDockerRestart",
"message": "docker is functioning properly"
},
{
"type": "FrequentUnregisterNetDevice",
"status": "False",
"lastHeartbeatTime": "2023-02-27T08:10:59Z", <--- HERE
"lastTransitionTime": "2023-02-09T07:41:54Z",
"reason": "NoFrequentUnregisterNetDevice",
"message": "node is functioning properly"
},
{
"type": "VMEventScheduled",
"status": "False",
"lastHeartbeatTime": "2023-02-27T08:10:59Z", <--- HERE
"lastTransitionTime": "2023-02-10T20:37:45Z",
"reason": "NoVMEventScheduled",
"message": "VM has no scheduled event"
},
{
"type": "ReadonlyFilesystem",
"status": "False",
"lastHeartbeatTime": "2023-02-27T08:10:59Z", <--- HERE
"lastTransitionTime": "2023-02-09T07:41:54Z",
"reason": "FilesystemIsNotReadOnly",
"message": "Filesystem is not read-only"
},
{
"type": "FrequentContainerdRestart",
"status": "False",
"lastHeartbeatTime": "2023-02-27T08:10:59Z", <--- HERE
"lastTransitionTime": "2023-02-09T07:41:54Z",
"reason": "NoFrequentContainerdRestart",
"message": "containerd is functioning properly"
},
{
"type": "KernelDeadlock",
"status": "False",
"lastHeartbeatTime": "2023-02-27T08:10:59Z", <--- HERE
"lastTransitionTime": "2023-02-09T07:41:54Z",
"reason": "KernelHasNoDeadlock",
"message": "kernel has no deadlock"
},
{
"type": "FrequentKubeletRestart",
"status": "False",
"lastHeartbeatTime": "2023-02-27T08:10:59Z", <--- HERE
"lastTransitionTime": "2023-02-09T07:41:54Z",
"reason": "NoFrequentKubeletRestart",
"message": "kubelet is functioning properly"
},
{
"type": "MemoryPressure",
"status": "False",
"lastHeartbeatTime": "2023-02-27T08:15:15Z", <--- HERE
"lastTransitionTime": "2023-02-10T20:38:18Z",
"reason": "KubeletHasSufficientMemory",
"message": "kubelet has sufficient memory available"
},
{
"type": "DiskPressure",
"status": "False",
"lastHeartbeatTime": "2023-02-27T08:15:15Z", <--- HERE
"lastTransitionTime": "2023-02-10T20:38:18Z",
"reason": "KubeletHasNoDiskPressure",
"message": "kubelet has no disk pressure"
},
{
"type": "PIDPressure",
"status": "False",
"lastHeartbeatTime": "2023-02-27T08:15:15Z", <--- HERE
"lastTransitionTime": "2023-02-10T20:38:18Z",
"reason": "KubeletHasSufficientPID",
"message": "kubelet has sufficient PID available"
},
{
"type": "Ready",
"status": "True",
"lastHeartbeatTime": "2023-02-27T08:15:15Z", <--- HERE
"lastTransitionTime": "2023-02-10T20:38:18Z",
"reason": "KubeletReady",
"message": "kubelet is posting ready status. AppArmor enabled"
}
],
"addresses": [ ...
],
"daemonEndpoints": { ...
},
"nodeInfo": { ...
},
"images": [ ...
],
"volumesInUse": [ ...
],
"volumesAttached": [ ...
]
}
}
}
```
</details>
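To check which fields actually differ between consecutive watch events, the captured `nodes.log` can be diffed per Node. The following is a minimal sketch, assuming the file holds one JSON watch event per line (as produced by the `curl` commands above with `pretty=false`). The helper names `flatten` and `changed_paths` and the sample Node `node-a` are made up for illustration.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Set, Tuple

# resourceVersion changes on every write and managedFields carries per-writer
# timestamps; both are ignored so they do not drown out the signal.
IGNORED_PREFIXES = ("metadata.resourceVersion", "metadata.managedFields")
_MISSING = object()


def flatten(obj: Any, prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts and lists into {"a.b[0].c": value} pairs."""
    if isinstance(obj, dict):
        children = [(f"{prefix}.{k}" if prefix else k, v) for k, v in obj.items()]
    elif isinstance(obj, list):
        children = [(f"{prefix}[{i}]", v) for i, v in enumerate(obj)]
    else:
        return {prefix: obj}
    flat: Dict[str, Any] = {}
    for path, value in children:
        flat.update(flatten(value, path))
    return flat


def changed_paths(lines: Iterable[str]) -> Iterator[Tuple[str, Set[str]]]:
    """Yield (node name, changed field paths) for every MODIFIED event."""
    last_seen: Dict[str, Dict[str, Any]] = {}
    for line in lines:
        if not line.strip():
            continue
        event = json.loads(line)
        if event.get("type") not in ("ADDED", "MODIFIED"):
            continue  # skip BOOKMARK, DELETED and ERROR events
        name = event["object"]["metadata"]["name"]
        current = flatten(event["object"])
        previous = last_seen.get(name)
        last_seen[name] = current
        if event["type"] != "MODIFIED" or previous is None:
            continue
        yield name, {
            path
            for path in set(previous) | set(current)
            if previous.get(path, _MISSING) != current.get(path, _MISSING)
            and not path.startswith(IGNORED_PREFIXES)
        }


# Made-up sample events; in practice pass an open file handle for nodes.log.
def _event(kind: str, version: str, heartbeat: str) -> str:
    node = {
        "metadata": {"name": "node-a", "resourceVersion": version},
        "status": {"conditions": [{"type": "Ready", "lastHeartbeatTime": heartbeat}]},
    }
    return json.dumps({"type": kind, "object": node})


sample = [
    _event("ADDED", "1", "2023-02-27T08:15:05Z"),
    _event("MODIFIED", "2", "2023-02-27T08:15:15Z"),
]
for node_name, paths in changed_paths(sample):
    print(node_name, sorted(paths))
```

Against a real capture, replace `sample` with `open("nodes.log")`. Paths under `status.conditions[...]` only would support the observation that nothing in `metadata` changed.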
**As Node metadata is not modified, the entire configuration reload process that follows is unnecessary.**
Here's an excerpt from the Filebeat debug log showing that, every 10s, a sequence of 4 stop and 4 start events is emitted for a single Pod (there are many Pods on a single Node, so this is multiplied):
```
2023-02-14T12:03:02.406Z autodiscover/autodiscover.go 267 Got a stop event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:02.406Z autodiscover/autodiscover.go 267 Got a stop event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:02.407Z autodiscover/autodiscover.go 267 Got a stop event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:02.407Z autodiscover/autodiscover.go 267 Got a stop event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:02.604Z autodiscover/autodiscover.go 182 Got a start event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:02.604Z autodiscover/autodiscover.go 182 Got a start event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:02.605Z autodiscover/autodiscover.go 182 Got a start event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:02.607Z autodiscover/autodiscover.go 182 Got a start event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:12.091Z autodiscover/autodiscover.go 267 Got a stop event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:12.092Z autodiscover/autodiscover.go 267 Got a stop event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:12.093Z autodiscover/autodiscover.go 267 Got a stop event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:12.093Z autodiscover/autodiscover.go 267 Got a stop event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:12.099Z autodiscover/autodiscover.go 182 Got a start event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:12.100Z autodiscover/autodiscover.go 182 Got a start event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:12.101Z autodiscover/autodiscover.go 182 Got a start event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:12.102Z autodiscover/autodiscover.go 182 Got a start event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:22.415Z autodiscover/autodiscover.go 267 Got a stop event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:22.415Z autodiscover/autodiscover.go 267 Got a stop event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:22.416Z autodiscover/autodiscover.go 267 Got a stop event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:22.417Z autodiscover/autodiscover.go 267 Got a stop event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:22.715Z autodiscover/autodiscover.go 182 Got a start event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:22.716Z autodiscover/autodiscover.go 182 Got a start event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:22.717Z autodiscover/autodiscover.go 182 Got a start event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:22.718Z autodiscover/autodiscover.go 182 Got a start event. esdevazk8swe1-3-kb-84795bfcd6-cd984
```
With the above template configuration, most of these start/stop events should be ignored, but not events for Kibana Pods. Here is a debug log excerpt showing a sequence of stop and start events that leads to a runner error, potentially caused by a `log` input race condition (see https://github.com/elastic/beats/issues/34388#issuecomment-1439816785). Once such a runner error occurs, the Autodiscover worker initiates a configuration reload on _every_ received event ([src](https://github.com/elastic/beats/blob/v8.5.3/libbeat/autodiscover/autodiscover.go#L133)).
```
2023-02-14T12:03:02.406Z autodiscover/autodiscover.go 267 Got a stop event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:02.406Z autodiscover/autodiscover.go 267 Got a stop event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:02.406Z cfgfile/list.go 64 Starting reload procedure, current runners: 3
2023-02-14T12:03:02.406Z cfgfile/list.go 82 Start list: 0, Stop list: 1
2023-02-14T12:03:02.407Z autodiscover/autodiscover.go 267 Got a stop event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:02.407Z cfgfile/list.go 64 Starting reload procedure, current runners: 2
2023-02-14T12:03:02.407Z cfgfile/list.go 82 Start list: 0, Stop list: 1
2023-02-14T12:03:02.407Z autodiscover/autodiscover.go 267 Got a stop event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:02.407Z cfgfile/list.go 64 Starting reload procedure, current runners: 1
2023-02-14T12:03:02.407Z cfgfile/list.go 82 Start list: 0, Stop list: 1
2023-02-14T12:03:02.604Z autodiscover/autodiscover.go 182 Got a start event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:02.604Z autodiscover/autodiscover.go 182 Got a start event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:02.605Z cfgfile/list.go 64 Starting reload procedure, current runners: 0
2023-02-14T12:03:02.605Z cfgfile/list.go 82 Start list: 1, Stop list: 0
2023-02-14T12:03:02.605Z cfgfile/list.go 107 Error creating runner from config: Can only start an input when all related states are finished: {Id: native::4927659-2049, Finished: false, Fileinfo: &{0.log 31895141 416 {696363713 63811972980 0xaaaad7612520} {2049 4927659 33184 1 0 0 0 0 31895141 4096 0 62304 {1676357648 659620958} {1676376180 696363713} {1676376180 696363713} [0 0]}}, Source: /var/log/pods/elastic_esdevazk8swe1-3-kb-84795bfcd6-cd984_340bb25a-a329-4d4b-b934-e8605c3d4ee8/kibana/0.log, Offset: 29901889, Timestamp: 2023-02-14 12:03:01.723335032 +0000 UTC m=+149.690897691, TTL: -1ns, Type: container, Meta: map[], FileStateOS: 4927659-2049}
2023-02-14T12:03:02.605Z autodiscover/autodiscover.go 182 Got a start event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:02.606Z autodiscover/autodiscover.go 156 Reloading existing autodiscover configs after error
2023-02-14T12:03:02.606Z cfgfile/list.go 64 Starting reload procedure, current runners: 0
2023-02-14T12:03:02.606Z cfgfile/list.go 82 Start list: 2, Stop list: 0
2023-02-14T12:03:02.607Z cfgfile/list.go 107 Error creating runner from config: Can only start an input when all related states are finished: {Id: native::4927659-2049, Finished: false, Fileinfo: &{0.log 31895141 416 {696363713 63811972980 0xaaaad7612520} {2049 4927659 33184 1 0 0 0 0 31895141 4096 0 62304 {1676357648 659620958} {1676376180 696363713} {1676376180 696363713} [0 0]}}, Source: /var/log/pods/elastic_esdevazk8swe1-3-kb-84795bfcd6-cd984_340bb25a-a329-4d4b-b934-e8605c3d4ee8/kibana/0.log, Offset: 29901889, Timestamp: 2023-02-14 12:03:01.723335032 +0000 UTC m=+149.690897691, TTL: -1ns, Type: container, Meta: map[], FileStateOS: 4927659-2049}
2023-02-14T12:03:02.607Z autodiscover/autodiscover.go 182 Got a start event. esdevazk8swe1-3-kb-84795bfcd6-cd984
2023-02-14T12:03:02.608Z autodiscover/autodiscover.go 156 Reloading existing autodiscover configs after error
2023-02-14T12:03:02.608Z cfgfile/list.go 64 Starting reload procedure, current runners: 1
2023-02-14T12:03:02.608Z cfgfile/list.go 82 Start list: 2, Stop list: 0
2023-02-14T12:03:02.609Z cfgfile/list.go 107 Error creating runner from config: Can only start an input when all related states are finished: {Id: native::4927659-2049, Finished: false, Fileinfo: &{0.log 31895141 416 {696363713 63811972980 0xaaaad7612520} {2049 4927659 33184 1 0 0 0 0 31895141 4096 0 62304 {1676357648 659620958} {1676376180 696363713} {1676376180 696363713} [0 0]}}, Source: /var/log/pods/elastic_esdevazk8swe1-3-kb-84795bfcd6-cd984_340bb25a-a329-4d4b-b934-e8605c3d4ee8/kibana/0.log, Offset: 29901889, Timestamp: 2023-02-14 12:03:01.723335032 +0000 UTC m=+149.690897691, TTL: -1ns, Type: container, Meta: map[], FileStateOS: 4927659-2049}
2023-02-14T12:03:02.609Z autodiscover/autodiscover.go 267 Got a stop event. aad-pod-identity-nmi-5jm82
2023-02-14T12:03:02.609Z autodiscover/autodiscover.go 156 Reloading existing autodiscover configs after error <--- HERE
2023-02-14T12:03:02.609Z cfgfile/list.go 64 Starting reload procedure, current runners: 2
2023-02-14T12:03:02.609Z cfgfile/list.go 82 Start list: 1, Stop list: 0
2023-02-14T12:03:02.610Z cfgfile/list.go 107 Error creating runner from config: Can only start an input when all related states are finished: {Id: native::4927659-2049, Finished: false, Fileinfo: &{0.log 31895141 416 {696363713 63811972980 0xaaaad7612520} {2049 4927659 33184 1 0 0 0 0 31895141 4096 0 62304 {1676357648 659620958} {1676376180 696363713} {1676376180 696363713} [0 0]}}, Source: /var/log/pods/elastic_esdevazk8swe1-3-kb-84795bfcd6-cd984_340bb25a-a329-4d4b-b934-e8605c3d4ee8/kibana/0.log, Offset: 29901889, Timestamp: 2023-02-14 12:03:01.723335032 +0000 UTC m=+149.690897691, TTL: -1ns, Type: container, Meta: map[], FileStateOS: 4927659-2049}
2023-02-14T12:03:02.610Z autodiscover/autodiscover.go 267 Got a stop event. aad-pod-identity-nmi-5jm82
2023-02-14T12:03:02.610Z autodiscover/autodiscover.go 156 Reloading existing autodiscover configs after error <--- HERE
```
Ultimately, the following frequency of configuration reloads was observed:
```
1 2023-02-14T12:03:01
66 2023-02-14T12:03:02
113 2023-02-14T12:03:12 <--- HERE
27 2023-02-14T12:03:13
6 2023-02-14T12:03:22
5 2023-02-14T12:03:32
79 2023-02-14T12:03:33
1 2023-02-14T12:03:42
28 2023-02-14T12:03:43
1 2023-02-14T12:03:52
66 2023-02-14T12:03:53
...
```
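A per-second tally like the one above can be reproduced from a debug log with a few lines of Python. This is a sketch, assuming every log line starts with an ISO timestamp in the same shape as the excerpts in this report and that each reload is marked by the `Starting reload procedure` message; `reloads_per_second` is an illustrative helper name.

```python
from collections import Counter
from typing import Dict, Iterable

RELOAD_MARKER = "Starting reload procedure"


def reloads_per_second(lines: Iterable[str]) -> Dict[str, int]:
    """Count reload messages, grouped by timestamp truncated to whole seconds."""
    counts: Counter = Counter()
    for line in lines:
        if RELOAD_MARKER in line:
            # "2023-02-14T12:03:02.406Z ..." -> "2023-02-14T12:03:02"
            counts[line[:19]] += 1
    return dict(counts)


# Three lines taken from the debug excerpt above; two of them are reloads.
sample = [
    "2023-02-14T12:03:02.406Z cfgfile/list.go 64 Starting reload procedure, current runners: 3",
    "2023-02-14T12:03:02.406Z cfgfile/list.go 82 Start list: 0, Stop list: 1",
    "2023-02-14T12:03:02.407Z cfgfile/list.go 64 Starting reload procedure, current runners: 2",
]
for second, count in sorted(reloads_per_second(sample).items()):
    print(f"{count:4d} {second}")
```

For a full log, pass an open file handle instead of `sample`. If the real log lines carry a different prefix, the slice `line[:19]` needs adjusting.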
### Enhancement request
Autodiscover should filter out Node and Namespace object modifications that do not change the object's metadata.
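The requested check could look roughly like the sketch below. It is written in Python purely for illustration (the actual watcher code is Go), and it assumes that only `metadata.labels` and `metadata.annotations` matter for enrichment. That field list and the `metadata_changed` name are assumptions for this sketch, not the existing Autodiscover API.

```python
from typing import Any, Mapping

# Assumption: these are the only metadata fields that feed Pod enrichment.
RELEVANT_METADATA_KEYS = ("labels", "annotations")


def metadata_changed(old: Mapping[str, Any], new: Mapping[str, Any]) -> bool:
    """Return True only if a relevant metadata field differs between versions."""
    old_meta = old.get("metadata") or {}
    new_meta = new.get("metadata") or {}
    return any(
        (old_meta.get(key) or {}) != (new_meta.get(key) or {})
        for key in RELEVANT_METADATA_KEYS
    )


# Made-up Node objects: a heartbeat-only update versus a label change.
before = {
    "metadata": {"name": "node-a", "labels": {"pool": "default"}},
    "status": {"conditions": [{"type": "Ready", "lastHeartbeatTime": "2023-02-27T08:15:05Z"}]},
}
heartbeat_only = {
    "metadata": {"name": "node-a", "labels": {"pool": "default"}},
    "status": {"conditions": [{"type": "Ready", "lastHeartbeatTime": "2023-02-27T08:15:15Z"}]},
}
relabelled = {
    "metadata": {"name": "node-a", "labels": {"pool": "spot"}},
    "status": heartbeat_only["status"],
}

print(metadata_changed(before, heartbeat_only))  # heartbeat update: no Pod events needed
print(metadata_changed(before, relabelled))      # label change: Pods should be refreshed
```

With a check like this in the Node and Namespace update handlers, a heartbeat-only update would be dropped before any Pod stop/start events are emitted.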
### Workarounds
If `hints` are not used, Node and Namespace watchers can be disabled entirely ([src](https://github.com/elastic/beats/blob/v8.5.3/libbeat/autodiscover/providers/kubernetes/pod.go#L128-L136)) with the following configuration:
```
filebeat:
  autodiscover:
    providers:
    - add_resource_metadata:
        namespace:
          enabled: false
        node:
          enabled: false
      hints.enabled: false
```