I am using Filebeat with autodiscover within a Kubernetes cluster to capture logs. When a Kubernetes pod terminates, Filebeat immediately stops reading its log entries, which can result in the last lines of the log not being published.
The Filebeat configuration is set up as follows:
filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      resource: pod
      scope: node
      hints.enabled: true
      hints.default_config:
        type: container
        paths:
          - /var/log/containers/*-${data.container.id}.log
        fields_under_root: true
      include_annotations:
        - app.kubernetes.io/version
        - app-release
This issue appears to be the same as the one reported in Filebeat autodiscover should not stop docker prospectors until it reads all lines · Issue #6694 · elastic/beats (github.com), but for the Kubernetes provider. That issue was fixed by the introduction of the close_timeout configuration option, which is still in place today.
From testing different versions, the behaviour is as expected up to and including Filebeat 7.12.1, but not from 7.13.0 onwards.
Logs from Filebeat 7.12.1
2023-02-13T20:27:52.120Z DEBUG [autodiscover.pod] kubernetes/pod.go:129 Watcher Pod update for pod: log-provider-0, status: Running
2023-02-13T20:27:52.854Z DEBUG [autodiscover.pod] kubernetes/pod.go:145 Watcher Pod update (terminating)
2023-02-13T20:28:52.856Z DEBUG [autodiscover] autodiscover/autodiscover.go:253 Got a stop event
2023-02-13T20:28:52.857Z INFO input/input.go:136 input ticker stopped
Note the 60-second gap between the pod update event received from Kubernetes as the pod terminates and the stop event published within Filebeat to stop reading the logs.
Logs from Filebeat 7.13.0
2023-02-13T20:26:10.123Z DEBUG [autodiscover.pod] kubernetes/pod.go:145 Watcher Pod update
2023-02-13T20:26:10.128Z DEBUG [autodiscover] autodiscover/autodiscover.go:253 Got a stop event
2023-02-13T20:26:10.129Z INFO input/input.go:136 input ticker stopped
Note that the stop event is published immediately, about 5 ms after the pod update.
From some investigation, I think this relates to the changes in Refactor kubernetes autodiscover to avoid skipping short-living pods by jsoriano · Pull Request #24742 · elastic/beats (github.com). Before that change, the stop event on pod termination was never published immediately; it was always delayed by the close timeout. After that change, the stop event is published immediately if any container within the pod has a non-empty running container status. I have no experience with the codebase, but I wonder whether the condition should be podTerminating rather than podTerminated.