Filebeat add_kubernetes_metadata pod uid matcher doesn't match poduid for rotated file on AKS

Hi,

As the title suggests, I believe that when using pod UID matching (/var/log/pods) with the config below, which relies on the fix that was backported into release v7.17.1, Filebeat still cannot get the metadata for the second file once it has been rotated by the kubelet on AKS. I'm not sure whether this affects other managed Kubernetes offerings, as we only use AKS.

I believe the issue is here: when AKS rotates the file it appends a number to the end, e.g. "*.log.12451251", so Filebeat skips from line 101 down to line 135, no pod UID is extracted, and therefore nothing can be matched. Is there a reason the check isn't the below instead, so it would pick up both scenarios?

if strings.Contains(source, ".log")
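
For illustration, here is a minimal standalone sketch (not the actual Beats matcher code, just the two string checks run against hypothetical file names) of why the current suffix check misses the rotated file while a Contains check would accept both:

package main

import (
	"fmt"
	"strings"
)

func main() {
	// Hypothetical file names: the live container log and one rotated by the kubelet.
	for _, name := range []string{"0.log", "0.log.12451251"} {
		fmt.Printf("%-15s HasSuffix(\".log\")=%-5v Contains(\".log\")=%v\n",
			name,
			strings.HasSuffix(name, ".log"), // current check: only the live file matches
			strings.Contains(name, ".log"))  // proposed check: both files match
	}
}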

Error from filebeat debug logs:

2022-03-07T07:06:07.991Z	DEBUG	[kubernetes]	add_kubernetes_metadata/kubernetes.go:278	Index key debrief-test_debrief-eventhub-consumer-dws-12ffasfgag44-w9cvs_da5d did not match any of the cached resources	{"libbeat.processor": "add_kubernetes_metadata"}

Filebeat config file:

---
filebeatConfig:
    filebeat.yml: |
      filebeat.inputs:
      - type: container
        exclude_files: ['.gz$']
        paths:
          - /var/log/pods/*/*/*.log*
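          # *.log* (rather than *.log) so that files rotated by the kubelet, e.g. 0.log.12451251, are also harvested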
        multiline.type: pattern
        multiline.pattern: '^[[:space:]]|^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
        multiline.negate: false
        multiline.match: after
        processors:
        - decode_json_fields:
            when.regexp.message: '^{'
            fields: ["message"]
            process_array: false
            max_depth: 1
            target: "parsed"
            overwrite_keys: false
            add_error_key: true
        - add_labels:
            labels:
              k8s_cluster: "${cluster_name}"
        - add_kubernetes_metadata:
            host: $${NODE_NAME}
            default_matchers.enabled: false
            default_indexer.enabled: false
            indexers:
              - pod_uid:
            matchers:
              - logs_path:
                  logs_path: "/var/log/pods/"
                  resource_type: "pod"
        - drop_fields:
            fields: ["container.id", "agent.id", "agent.ephemeral_id", "container.runtime", "ecs.version", "input.type", "kubernetes.labels.pod-template-hash", "kubernetes.pod.uid", "kubernetes.labels.app_kubernetes_io/config-sha256"]
            ignore_missing: true
      output.elasticsearch:
        hosts: ["${elastic_url}"]
        index: "k8s-logs-${unlabelled_name}-%%{[agent.version]}"
        ${indices}
        username: "${elastic_username}"
        password: "${elastic_password}"
      queue.mem:
        events: 16384
      # 10MB for harvester size  
      harvester_buffer_size: 10485760
      setup.template.name: "ds-k8s-logs-template"
      setup.template.pattern: "k8s-logs*"
      setup.ilm.enabled: false
      logging.level: ${logging_level}
      http.enabled: ${http_enabled}
      http.host: localhost
      http.port: 5066

My belief is that the events hitting this error are the ones where the rotated file's pod UID cannot be seen, because the path ends in ".log.124312" rather than simply ".log".

In the meantime I am going to try the autodiscover configuration to see if that helps matters, as I believe it discovers Kubernetes metadata differently.

Thanks in advance.

Hi @Jayw77,
I think you do not need to process the rotated log files; you could use:

paths:
  - /var/log/pods/*/*/*.log

instead of

paths:
  - /var/log/pods/*/*/*.log*
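
For reference, a quick sketch (using Go's filepath.Match as a stand-in for Filebeat's glob handling, with hypothetical file names) of which files each of the two patterns picks up:

package main

import (
	"fmt"
	"path/filepath"
)

func main() {
	for _, pattern := range []string{"*.log", "*.log*"} {
		for _, name := range []string{"0.log", "0.log.12451251"} {
			// "*.log" matches only the live log file; "*.log*" also matches the rotated one.
			ok, _ := filepath.Match(pattern, name)
			fmt.Printf("%-7s matches %-15s -> %v\n", pattern, name, ok)
		}
	}
}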

Hi @Tetiana_Kravchenko ,

Thanks for your fast response!

I probably should have mentioned that I also had an enterprise support issue open: we are losing thousands of logs per minute for one application that logs heavily. When checking metrics we noticed the truncated-file counts were high, which I believed was caused by the large amount of data being logged at peak load, as it's fine overnight. Elastic support agreed we should either stop rotating (managed AKS makes this pretty hard to change) or at least scan both files. In light of that we had to switch to /var/log/pods instead of /var/log/containers, which is a symlink to a single file (and doesn't include rotated files). But yes, this latest challenge means the Kubernetes metadata only works for one of the files.

@Jayw77 thank you for the clarifications.
I've opened this PR - Fix add_kubernetes_metadata matcher: support rotated logs when 'resource_type: pod' by tetianakravchenko · Pull Request #30720 · elastic/beats · GitHub.
I've also noticed that your log message uses an incorrect pod UID:

Index key debrief-test_debrief-eventhub-consumer-dws-12ffasfgag44-w9cvs_da5d did not match any of the cached resources

It uses debrief-test_debrief-eventhub-consumer-dws-12ffasfgag44-w9cvs_da5d instead of the actual pod UID da5d..., which seems to be because this condition was applied here - beats/matchers.go at 1d05ba86138cfc9a5ae5c0acc64a57b8d81678ff · elastic/beats · GitHub. I think that is the reason you are getting Index key ... did not match any of the cached resources.
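
To make the failure mode concrete, here is a rough sketch (a hypothetical helper, not the actual matchers.go logic) of why the whole directory name ends up as the index key for a rotated file, while the pod_uid indexer caches bare pod UIDs:

package main

import (
	"fmt"
	"strings"
)

// indexKey is a simplified stand-in for the logs_path matcher with resource_type: pod.
func indexKey(logsPath, source string) string {
	// First path segment after logs_path, e.g. "<namespace>_<pod-name>_<pod-uid>".
	dir := strings.SplitN(strings.TrimPrefix(source, logsPath), "/", 2)[0]
	if strings.HasSuffix(source, ".log") {
		// Live log file: the pod UID is taken from the end of the directory name.
		parts := strings.Split(dir, "_")
		return parts[len(parts)-1]
	}
	// Rotated file: the ".log" suffix check fails, so the full directory name is used,
	// which can never equal one of the bare pod UIDs cached by the pod_uid indexer.
	return dir
}

func main() {
	fmt.Println(indexKey("/var/log/pods/", "/var/log/pods/ns_mypod_da5d1234/app/0.log"))          // da5d1234
	fmt.Println(indexKey("/var/log/pods/", "/var/log/pods/ns_mypod_da5d1234/app/0.log.12451251")) // ns_mypod_da5d1234
}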

Hi @Tetiana_Kravchenko,

Great, thanks for opening the PR.

In the meantime I also switched to the autodiscover config below, and it appears to have resolved the issue too; however, the PR may prevent somebody else from tripping over this in future.

I only applied it to prod recently, so I will monitor it tomorrow to be sure.

I also noticed the pod UID wasn't extracted, but I assumed that was because that part of the code doesn't run when the file name doesn't end in .log, so the subsequent steps that extract it never run.

filebeat.yml

      filebeat.autodiscover:
        providers:
          - type: kubernetes
            node: ${NODE_NAME}
            include_pod_uid: true
            in_cluster: true
            hints.enabled: true
            hints.default_config:
              type: container
              scan_frequency: 1s
              exclude_files: ['.gz$']
              multiline.type: pattern
              multiline.pattern: '^[[:space:]]|^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
              multiline.negate: false
              multiline.match: after
              processors:
                - decode_json_fields:
                    when.regexp.message: '^{'
                    fields: ["message"]
                    process_array: false
                    max_depth: 1
                    target: "parsed"
                    overwrite_keys: false
                    add_error_key: true
                - add_labels:
                    labels:
                      k8s_cluster: "${cluster_name}"
                - drop_fields:
                    fields: ["container.id", "agent.id", "agent.ephemeral_id", "container.runtime", "ecs.version", "input.type", "kubernetes.labels.pod-template-hash", "kubernetes.pod.uid", "kubernetes.labels.app_kubernetes_io/config-sha256"]
                    ignore_missing: true
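              # For each discovered container this resolves to something like
              # /var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container-name>/*.log*, including rotated files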
              paths:
                - /var/log/pods/*_${data.kubernetes.pod.uid}/$${data.kubernetes.container.name}/*.log*
      processors:
        - add_cloud_metadata: ~
        - add_kubernetes_metadata:
            in_cluster: true
      output.elasticsearch:
        hosts: ["${elastic_url}"]
        index: "k8s-logs-${unlabelled_name}-%%{[agent.version]}"
        ${indices}
        username: "${elastic_username}"
        password: "${elastic_password}"
        bulk_max_size: ${bulk_max_size}
        worker: ${workers} 
      queue.mem:
        events: ${queue_mem}
        flush:
          min_events: ${bulk_max_size}
      setup.template.name: "ds-k8s-logs-template"
      setup.template.pattern: "k8s-logs*"
      setup.ilm.enabled: false
      logging.level: ${logging_level}
      http.enabled: ${http_enabled}
      http.host: localhost
      http.port: 5066

Thanks for your help so far, much appreciated!

Hi @Jayw77 ,
Great to hear that you could solve the issue!
FYI: PR is already merged.

