Filebeat autodiscover stopping too early when Kubernetes pod terminates

I am using Filebeat with autodiscover in a Kubernetes cluster to capture logs. When a Kubernetes pod terminates, Filebeat immediately stops reading log entries, which can result in the last lines of the log not being published.

The Filebeat configuration is set up as follows:

filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      resource: pod
      scope: node
      hints.enabled: true
      hints.default_config:
        type: container
        paths:
          - /var/log/containers/*-${data.container.id}.log
        fields_under_root: true          
      include_annotations:
        - app.kubernetes.io/version
        - app-release

This issue appears to be the same as the one reported in "Filebeat autodiscover should not stop docker prospectors until it reads all lines" (Issue #6694, elastic/beats on github.com), but for the kubernetes input. That issue was fixed by the introduction of the close_timeout configuration option, which is still in place today.

From testing different versions, it works as expected up to and including Filebeat 7.12.1, but not from 7.13.0 onwards.

Logs from Filebeat 7.12.1

2023-02-13T20:27:52.120Z	DEBUG	[autodiscover.pod]	kubernetes/pod.go:129	Watcher Pod update for pod: log-provider-0, status: Running
2023-02-13T20:27:52.854Z	DEBUG	[autodiscover.pod]	kubernetes/pod.go:145	Watcher Pod update (terminating)
2023-02-13T20:28:52.856Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:253	Got a stop event
2023-02-13T20:28:52.857Z	INFO	input/input.go:136	input ticker stopped

Note the 60-second gap between the pod update event received from Kubernetes as the pod terminates and the stop event published within Filebeat to stop reading the logs.

Logs from Filebeat 7.13.0

2023-02-13T20:26:10.123Z	DEBUG	[autodiscover.pod]	kubernetes/pod.go:145	Watcher Pod update
2023-02-13T20:26:10.128Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:253	Got a stop event
2023-02-13T20:26:10.129Z	INFO	input/input.go:136	input ticker stopped

Note that the stop event is published immediately, about 5 ms after the pod update.

From some investigation, I think it relates to the changes in "Refactor kubernetes autodiscover to avoid skipping short-living pods" by jsoriano (Pull Request #24742, elastic/beats on github.com). Before that change, the stop event on pod termination was never published immediately; it was always delayed by the close timeout. After that change, the stop event is published immediately if any container within the pod has a non-empty running container status. I have no experience in the codebase, but I wonder if the condition should be podTerminating and not podTerminated.
