Error when using autodiscover + hints + templates with filebeat 8.6

Hi,

I thought I should check in with the community here before creating a GitHub issue, just in case there's something I'm not understanding correctly.

After upgrading our filebeat kubernetes daemonset from 7.17.8 to 8.6.0 I can't seem to get rid of errors of this type:

log.level: error
log.logger: autodiscover
log.origin.file.line: 109
log.origin.file.name: cfgfile/list.go

message:
 Error creating runner from config: failed to create input: Can only start an input when all related states are finished: {Id: ea745ab688be85a9-native::3884203-2049, Finished: false, Fileinfo: &{frontend-86c8579b5b-mhnpg_helpdesk-frontend_frontend-mgmt-1cc73434a92abe9b93d9a3d971cfc4182e8ce64ac0e03f0c6e395875236fd514.log 374 416 {204820038 63804978609 0x56347552d700} {2049 3884203 1 33184 0 0 0 0 374 4096 8 {1669381808 728813408} {1669381809 204820038} {1669381809 204820038} [0 0 0]}}, Source: /var/log/containers/frontend-86c8579b5b-mhnpg_helpdesk-frontend_frontend-mgmt-1cc73434a92abe9b93d9a3d971cfc4182e8ce64ac0e03f0c6e395875236fd514.log, Offset: 0, Timestamp: 2023-01-19 13:38:27.166489276 +0000 UTC m=+58865.698641043, TTL: -1ns, Type: container, Meta: map[stream:stdout], FileStateOS: 3884203-2049}

The number of errors varies depending on the number of pods deployed. In our current prod cluster I'm observing roughly 60k messages per 24 hours.

Filebeat is currently deployed as a daemonset using the official helm chart version 8.5.1 and running in Azure AKS, kubernetes version 1.24.6.

This is the relevant part of our current filebeat configuration (I've excluded output.* and setup.*):

        logging:
          level: warning
          metrics.enabled: false
          json: true
        processors:
          - # disable logs from select sources
            drop_event.when.or:
              - equals.kubernetes.labels.app: "secrets-store-csi-driver"
              - equals.kubernetes.labels.app: "secrets-store-provider-azure"
              - equals.kubernetes.labels.app: "konnectivity-agent"
        filebeat.autodiscover:
          providers:
            - type: kubernetes
              node: ${NODE_NAME}
              cleanup_timeout: 2m
              hints.enabled: true
              hints.default_config:
                type: container
                paths:
                  - /var/log/containers/*-${data.kubernetes.container.id}.log
              templates:
                - # nginx logs: configure the filebeat nginx module
                  condition.equals:
                    # This pod annotation must be set on the app during deployment for this template to be applied
                    # See available fields for matching here:
                    #   https://www.elastic.co/guide/en/beats/filebeat/current/configuration-autodiscover.html#_kubernetes
                    kubernetes.annotations.no.dsb-norge.filebeat/autodiscover-template: nginx
                  config:
                    - module: nginx
                      ingress_controller:
                        enabled: false
                      access:
                        enabled: true
                        input:
                          type: container
                          stream: stdout
                          paths:
                            - /var/log/containers/*-${data.kubernetes.container.id}.log
                      error:
                        enabled: true
                        input:
                          type: container
                          stream: stderr
                          paths:
                            - /var/log/containers/*-${data.kubernetes.container.id}.log

I'm able to avoid the error by either removing the templates config or by disabling hints.default_config. Neither of these is a suitable solution for us, as they result in missing logs or logs not being parsed correctly.
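To illustrate why these two parts of the configuration might interact, here is a rough Python sketch showing that the hints default input and both nginx fileset inputs resolve to the same log file for a given container. Note the assumptions: the `inputs_matching` helper and the `INPUT_PATHS` table are made up for illustration, and the plain string replacement only approximates how filebeat expands `${data.kubernetes.container.id}`. It says nothing about how filebeat actually schedules the inputs.

```python
from fnmatch import fnmatch

# Placeholder used in the path patterns of the configuration above.
PLACEHOLDER = "${data.kubernetes.container.id}"

# Path patterns copied from the configuration; the keys are illustrative labels.
INPUT_PATHS = {
    "hints.default_config": "/var/log/containers/*-${data.kubernetes.container.id}.log",
    "nginx access (stdout)": "/var/log/containers/*-${data.kubernetes.container.id}.log",
    "nginx error (stderr)": "/var/log/containers/*-${data.kubernetes.container.id}.log",
}


def inputs_matching(log_file, container_id):
    """Return the labels of all configured inputs whose glob matches log_file."""
    matches = []
    for label, pattern in INPUT_PATHS.items():
        # Assumption: simple substitution stands in for filebeat's variable expansion.
        expanded = pattern.replace(PLACEHOLDER, container_id)
        if fnmatch(log_file, expanded):
            matches.append(label)
    return matches
```

Calling `inputs_matching` with the log file and container ID from the error message above returns all three labels, so more than one input ends up pointing at the same file.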

The error messages all refer to log files from our nginx pods. We have multiple other types of deployments for which filebeat reports no issues. Since we are using the nginx module conditionally for these pods, this leads me to think there's some kind of race condition happening when the nginx module is applied via the templates config in combination with the default hints configuration.
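For anyone wanting to check the same thing in their own cluster, this is roughly how the errors can be grouped by pod. It is a minimal sketch that assumes filebeat's own log is available as one JSON object per line with flat `log.level`, `log.logger` and `message` keys (as in the excerpt above), and that container log file names follow the `<pod>_<namespace>_<container>-<64 hex chars>.log` pattern. The function name `count_errors_by_container` is my own.

```python
import json
import re
from collections import Counter

# Extracts the "Source: <path>" part of the autodiscover error message.
SOURCE_RE = re.compile(r"Source: (?P<path>/var/log/containers/[^,\s]+\.log)")

# Assumed file name layout: <pod>_<namespace>_<container>-<64 hex chars>.log
NAME_RE = re.compile(
    r"^(?P<pod>[^_]+)_(?P<namespace>[^_]+)_(?P<container>.+)-[0-9a-f]{64}\.log$"
)


def count_errors_by_container(lines):
    """Count autodiscover errors per (namespace, pod, container)."""
    counts = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(entry, dict):
            continue
        if entry.get("log.level") != "error":
            continue
        if entry.get("log.logger") != "autodiscover":
            continue
        source = SOURCE_RE.search(str(entry.get("message", "")))
        if source is None:
            continue
        file_name = source.group("path").rsplit("/", 1)[-1]
        parts = NAME_RE.match(file_name)
        if parts is None:
            # File name does not follow the assumed layout; keep it as-is.
            key = ("?", file_name, "?")
        else:
            key = (parts["namespace"], parts["pod"], parts["container"])
        counts[key] += 1
    return counts
```

Feeding it the lines of a filebeat log file and printing `counts.most_common(10)` gives a quick overview of which pods produce the errors.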

We were able to achieve autodiscover with hints (including default_config) and templates using filebeat 7.17.8 just fine without errors. Running on the same kubernetes version and deployed with official helm chart version 7.17.3.

At first I thought I might be experiencing GitHub Issue #11834: [autodiscover] Error creating runner from config: Can only start an input when all related states are finished. But after reading it I saw that this was fixed in filebeat 7.9.0. Reading a bit further, I saw that there were a couple more issues resulting in this error, but those had also been fixed.

I have verified that we are not missing log entries, so I suspect that my issue is also a "recoverable error" and that it should possibly not be logged at error level :woman_shrugging:

Anyway, fingers crossed that some of you have experienced something similar or that you can spot an issue in our configuration :slight_smile:

Could the behavior we are observing be related to github Issue #33653: Input reload not working as expected under Elastic-Agent?

We are not using agent/fleet. With my limited understanding I'm not able to conclude whether it's related or not. I'm considering opening a new GitHub issue :thinking:

Since there was no response here, I've created github Issue #34388: Filebeat: Error when using autodiscover + hints + templates with filebeat 8.6