Filebeat adding incorrect kubernetes metadata to events going to logstash

Greetings,

We have filebeat deployed to a kubernetes cluster (our QA instance).
Filebeat is picking up logs for the containers and sending them to our logstash server, which ends up in our elasticsearch. However, the events are not getting the correct kubernetes metadata added. Our setup has worked well for a couple of years, but after a restart of filebeat , logging is not coming in as expected to our logstash.

My teams can no longer search for their events in kibana, because messages are being associated with a different kubernetes pod, container, deployment, etc.

Scenario that occurs for my pods...
In a published event, the wrong kubernetes data was added. Instead of getting the kubernetes data for my running pod/host dsp-spring-boot-pr-1538-generic-http-server-78c4f9757-mq7kt, it added the metadata from the pod/host snoop-doghouse-pr-4989-generic-http-server-76d9dfc678-6swzg, which is another pod on the same kubernetes node but logs to a different file.

I've tried multiple config adjustments, but here is the config I have, which used to work fine:

filebeat.yml

        filebeat.autodiscover:
          providers:
            - type: kubernetes
              hints.enabled: true
              hints.default_config:
                type: container
                fields:
                  env: "qa"
                paths:
                  - /var/log/containers/*.log
                multiline.pattern: '^[[:space:]]'
                multiline.match: after
                multiline.negate: false
                ignore_older:    3h        # ignore files that were modified 3 hours ago
                clean_inactive:  4h        # removes the state of a file after 4h (it must be > ignore_older + scan_frequency)
                close_inactive:  1m        # close file handle after 1 minute of inactivity (file can be reopened after scan_frequency)
                scan_frequency:  1m        # check for new files every 1 minute
                harvester_limit: 50        # limit the number of harvesters to 50 at a time
        filebeat.config:
          modules:
            path: /etc/filebeat/modules.d/*.yml
            reload.enabled: false
          inputs:
            enabled: true
            path: /etc/filebeat/*.yml
            reload.enabled: true
            reload.period: 10s
        logging.level: debug
        processors:
        - add_cloud_metadata: null
        - add_host_metadata:
            netinfo.enabled: true
        - add_kubernetes_metadata:
            host: $${NODE_NAME}
            in_cluster: true
            labels.dedot: true # https://github.com/elastic/beats/issues/8773
            annotations.dedot: true
            matchers:
            - logs_path:
                logs_path: "/var/log/containers/"
        - drop_event:
            when:
              equals:
                kubernetes.container.name: filebeat
        - drop_event:
            when:
              and:
                - equals:
                    kubernetes.namespace: jenkins
                - equals:
                    fields.env: qa
        http.enabled: true
        http.port: 5066
        output:
          elasticsearch:
            hosts:
            - http://elasticsearch-master:9200
            enabled: false
          file:
            enabled: false
          logstash:
            enabled: true
            hosts: "logstash-0.mkeymgmt.com:5044"
            ssl:
              certificate_authorities:
              - "/tmp/filebeat/root-ca.pem"
              certificate: "/tmp/filebeat/client.pem"
              key: "/tmp/filebeat/client.key"
              verification_mode: none
              supported_protocols:
              - TLSv1.2
        output.file:
          filename: filebeat
          number_of_files: 5
          path: /usr/share/filebeat/data
          rotate_every_kb: 10000

Log event from the pod:

{"log":"{\"@timestamp\":\"2024-02-29T21:07:47.453Z\",\"severity\":\"INFO\",\"application\":\"DSP\",\"trace\":\"\",\"span\":\"\",\"correlation_id\":\"\",\"pid\":\"7\",\"thread\":\"http-nio-8080-exec-8\",\"class\":\"c.l.dsp.utils.MetricsLogInterceptor\",\"message\":\"Endpoint=http://localhost:35481/v1/offerings, Execution-time-ms=7, Remote-ip=127.0.0.1, Forwarded-For=, Response-Code=200\",\"x_forwarded_for\":\"\",\"incap_client_ip\":\"\",\"host\":\"dsp-spring-boot-pr-1538-generic-http-server-78c4f9757-mq7kt\"}\n","stream":"stdout","time":"2024-02-29T21:07:47.453424719Z"}

However, I could not find this event in my dashboard using my usual method (I search for records by pod name).

When looking at the published event in logstash, there is a mismatch between the log file the event came from and the kubernetes info attached to it.
For example, my event message is correct, but as this excerpt shows, the kubernetes data belongs to a different pod:

"kubernetes": {
    "deployment": {
      "name": "snoop-doghouse-pr-4989-generic-http-server"
    },
    "namespace_uid": "94c1c379-7bc2-4937-951a-7bfb41ce84f8",
    "namespace": "greenhouse-pr-4989",
    "namespace_labels": {
      "pullRequest": "4989",
      "gitBranch": "SNAP-2890",
      "deleteStrategy": "branch-delete",
      "repoName": "greenhouse",
      "kubernetes_io/metadata_name": "doghouse-pr-4989"
    },
    "labels": {
      "release": "snoop-doghouse-pr-4989",
      "app": "generic-http-server",
      "gitSha": "a0200de9ec1e2d99630d968337a28e5c810db663",
      "pod-template-hash": "76d9dfc678"
    },

Any guidance would be greatly appreciated!

Thanks

Mike,
To address the issue of Filebeat adding incorrect Kubernetes metadata to logs: ensure Filebeat's service account has the proper permissions, and double-check the autodiscover and processors configuration in your filebeat.yml for accuracy, especially the host and logs_path settings. Also, verify that your Filebeat version is compatible with your Kubernetes cluster version. Use debug logging to gain insight into the metadata attachment process. If the problem persists, consider consulting Elastic's support or community forums for additional help.
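
For the debug-logging step, something like the excerpt below keeps the output focused on the metadata path. The selector names are based on the log.logger values Filebeat prints (kubernetes, processors) plus autodiscover, so treat this as a starting point rather than a definitive list:

        logging.level: debug
        # Limit debug output to the loggers involved in Kubernetes enrichment,
        # otherwise the Filebeat log gets very noisy.
        logging.selectors:
        - kubernetes      # add_kubernetes_metadata and the k8s watchers
        - autodiscover    # pod discovery and hints handling
        - processors      # the "Publish event" dumps showing the final fields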

Tim,
thanks for the guidance.
I have run filebeat in debug mode and can see that, for the log events, basically the only things correctly matching the pod I am working on are the actual message and the log path. Almost all other kubernetes details are for a different pod.

To test further, I decided to disable the add_kubernetes_metadata processor. Even running this way, filebeat still adds the incorrect data.
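
For reference, the processors block for that test looked roughly like this (a trimmed sketch of the config posted above, not my exact file):

        processors:
        - add_cloud_metadata: null
        - add_host_metadata:
            netinfo.enabled: true
        # add_kubernetes_metadata commented out for this test
        #- add_kubernetes_metadata:
        #    host: $${NODE_NAME}
        #    in_cluster: true
        #    labels.dedot: true
        #    annotations.dedot: true
        #    matchers:
        #    - logs_path:
        #        logs_path: "/var/log/containers/"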

Debug log event for a request to my application:

The kubernetes node info is as I expect, and so are the log.file.path and the message content.

{"log.level":"debug","@timestamp":"2024-03-04T18:25:10.184Z","log.logger":"processors","log.origin":{"file.name":"processing/processors.go","file.line":212},"message":"Publish event: {\n  \"@timestamp\": \"2024-03-04T18:25:02.758Z\",\n  \"@metadata\": {\n    \"beat\": \"filebeat\",\n    \"type\": \"_doc\",\n    \"version\": \"8.7.0\"\n  },\n  \"agent\": {\n    \"ephemeral_id\": \"6c727972-a691-44ef-93fa-a9b0c6b2f463\",\n    \"id\": \"cf4a0c6d-f527-4d49-bf40-be329ce3ea72\",\n    \"name\": \"filebeat-qa-use1-eks-0-filebeat-pdbfx\",\n    \"type\": \"filebeat\",\n    \"version\": \"8.7.0\"\n  },\n  \"ecs\": {\n    \"version\": \"8.0.0\"\n  },\n  \"host\": {\n    \"name\": \"filebeat-qa-use1-eks-0-filebeat-pdbfx\",\n    \"hostname\": \"filebeat-qa-use1-eks-0-filebeat-pdbfx\",\n    \"architecture\": \"x86_64\",\n    \"os\": {\n      \"version\": \"20.04.5 LTS (Focal Fossa)\",\n      \"family\": \"debian\",\n      \"name\": \"Ubuntu\",\n      \"kernel\": \"5.4.249-163.359.amzn2.x86_64\",\n      \"codename\": \"focal\",\n      \"type\": \"linux\",\n      \"platform\": \"ubuntu\"\n    },\n    \"containerized\": true,\n    \"ip\": [\n      \"172.21.18.121\"\n    ],\n    \"mac\": [\n      \"46-F6-2B-CC-4F-B2\"\n    ]\n  },\n  \"cloud\": {\n    \"service\": {\n      \"name\": \"EC2\"\n    },\n    \"account\": {\n      \"id\": \"013528166951\"\n    },\n    \"image\": {\n      \"id\": \"ami-0f070f98721222171\"\n    },\n    \"instance\": {\n      \"id\": \"i-0f411bb58364dd339\"\n    },\n    \"provider\": \"aws\",\n    \"machine\": {\n      \"type\": \"m5.4xlarge\"\n    },\n    \"region\": \"us-east-1\",\n    \"availability_zone\": \"us-east-1c\"\n  },\n  \"log\": {\n    \"offset\": 478697,\n    \"file\": {\n      \"path\": \"/var/log/containers/dsp-spring-boot-pr-1553-generic-http-server-56d7f9f479-nhdw8_dsp-spring-boot-pr-1553_dsp-spring-boot-pr-1553-507d2b474a7c2f0edfab0609e2a792624237ae3e700d98032c3618d10e07fa1f.log\"\n    }\n  },\n  \"stream\": \"stdout\",\n  \"kubernetes\": {\n    \"namespace_uid\": \"bde1bd9a-0231-4f91-839d-00370b9a1408\",\n    \"deployment\": {\n      \"name\": \"gh-doghouse-pr-4986-generic-http-server\"\n    },\n    \"namespace\": \"doghouse-pr-4986\",\n    \"container\": {\n      \"name\": \"gh-doghouse-pr-4986-init\"\n    },\n    \"node\": {\n      \"uid\": \"359c34da-29b3-42b7-a761-c0cb71f9e957\",\n      \"labels\": {\n        \"topology_kubernetes_io/region\": \"us-east-1\",\n        \"kubernetes_io/os\": \"linux\",\n        \"k8s_io/cloud-provider-aws\": \"be15df46fac02ffc71d649ee55d42db3\",\n        \"node_kubernetes_io/instance-type\": \"m5.4xlarge\",\n        \"topology_kubernetes_io/zone\": \"us-east-1c\",\n        \"kubernetes_io/hostname\": \"ip-172-21-18-252.ec2.internal\",\n        \"beta_kubernetes_io/arch\": \"amd64\",\n        \"failure-domain_beta_kubernetes_io/region\": \"us-east-1\",\n        \"beta_kubernetes_io/os\": \"linux\",\n        \"kubernetes_io/arch\": \"amd64\",\n        \"failure-domain_beta_kubernetes_io/zone\": \"us-east-1c\",\n        \"topology_ebs_csi_aws_com/zone\": \"us-east-1c\",\n        \"beta_kubernetes_io/instance-type\": \"m5.4xlarge\"\n      },\n      \"hostname\": \"ip-172-21-18-252.ec2.internal\",\n      \"name\": \"ip-172-21-18-252.ec2.internal\"\n    },\n    \"namespace_labels\": {\n      \"deleteStrategy\": \"branch-delete\",\n      \"gitBranch\": \"NO_TICKET_test_verify_pr_479\",\n      \"kubernetes_io/metadata_name\": \"doghouse-pr-4986\",\n      \"pullRequest\": \"4986\",\n      \"repoName\": \"doghouse\"\n    },\n    
\"labels\": {\n      \"gitSha\": \"f159d2586a51e9d8a6fe80d42c3e6fabb50784e0\",\n      \"pod-template-hash\": \"7bfbf75cb7\",\n      \"release\": \"gh-doghouse-pr-4986\",\n      \"app\": \"generic-http-server\"\n    },\n    \"pod\": {\n      \"ip\": \"172.21.16.188\",\n      \"name\": \"gh-doghouse-pr-4986-generic-http-server-7bfbf75cb7-ff6dk\",\n      \"uid\": \"d93dfa9f-d3bd-4321-9003-fa5e3253347f\"\n    },\n    \"replicaset\": {\n      \"name\": \"gh-doghouse-pr-4986-generic-http-server-7bfbf75cb7\"\n    }\n  },\n  \"container\": {\n    \"runtime\": \"docker\",\n    \"image\": {\n      \"name\": \"docker.artifactory.lkeymgmt.com/lk-gh:pr-4986\"\n    },\n    \"id\": \"4e72aee018d0a979e4a5307e6b4be150718abc224941e092804ae53fa8973473\"\n  },\n  \"message\": \"{\\\"@timestamp\\\":\\\"2024-03-04T18:25:02.758Z\\\",\\\"severity\\\":\\\"INFO\\\",\\\"application\\\":\\\"DSP\\\",\\\"trace\\\":\\\"\\\",\\\"span\\\":\\\"\\\",\\\"correlation_id\\\":\\\"\\\",\\\"pid\\\":\\\"7\\\",\\\"thread\\\":\\\"http-nio-8080-exec-7\\\",\\\"class\\\":\\\"c.l.dsp.utils.MetricsLogInterceptor\\\",\\\"message\\\":\\\"Endpoint=http://localhost:39661/v1/offerings, Execution-time-ms=5, Remote-ip=127.0.0.1, Forwarded-For=, Response-Code=200\\\",\\\"x_forwarded_for\\\":\\\"\\\",\\\"incap_client_ip\\\":\\\"\\\",\\\"host\\\":\\\"dsp-spring-boot-pr-1553-generic-http-server-56d7f9f479-nhdw8\\\"}\",\n  \"input\": {\n    \"type\": \"container\"\n  },\n  \"fields\": {\n    \"env\": \"qa\"\n  }\n}","service.name":"filebeat","ecs.version":"1.6.0"}

This exact same setup has served us well for years, and it adds the correct data on the other clusters in my environment.

kubernetes v1.23
filebeat v8.7.0

Thanks,
Mike

I had been testing with "add_kubernetes_metadata" disabled, and have now tried enabling it again.

It seems that, somehow, on my cluster kubernetes metadata is already present on the events before the add_kubernetes_metadata processor runs (based on the debug message below); however, I cannot yet tell where the metadata comes from. I definitely don't have it in the application logs.

{
  "log.level": "debug",
  "@timestamp": "2024-03-05T15:08:08.057Z",
  "log.logger": "kubernetes",
  "log.origin": {
    "file.name": "add_kubernetes_metadata/kubernetes.go",
    "file.line": 266
  },
  "message": "Skipping add_kubernetes_metadata processor as kubernetes metadata already exist",
  "service.name": "filebeat",
  "libbeat.processor": "add_kubernetes_metadata",
  "ecs.version": "1.6.0"
}

Greetings,
Issue workaround implemented.
Given the message that "kubernetes metadata already exist", I tried to track down how the data was getting injected. I ultimately could not figure out what is adding it, so I decided to change my filebeat.yml to drop any kubernetes fields first and then run the "add_kubernetes_metadata" processor, in the hope that this would associate the message event with the correct kubernetes metadata. To my delight, it worked.

The key to the config is the "drop_fields" processor on the "kubernetes" field, as you can see from the excerpt below:

        processors:
        #Drop the kubernetes fields, so the add_kubernetes_metadata processor can re-add them.
        - drop_fields:
            fields: ["kubernetes"]
        - add_cloud_metadata: null
        # Add host metadata including network information
        - add_host_metadata:
            netinfo.enabled: true
        # Add Kubernetes metadata to events
        - add_kubernetes_metadata:
            host: $${NODE_NAME}
            in_cluster: true
            labels.dedot: true # https://github.com/elastic/beats/issues/8773
            annotations.dedot: true
            matchers:
            - logs_path:
                logs_path: "/var/log/containers/"

Thanks,
Mike
