Add pod-uid support for add_kubernetes_metadata matchers


(Mario Mechoulam) #1

Hi everyone,

I would like to know whether the current add_kubernetes_metadata mechanism could be enhanced so that it can fetch the metadata starting from a pod-uid instead of a container ID.

In our current scenario we have a Kubernetes cluster with several pods on different nodes, using Filebeat to stream the logs to an Elasticsearch host. The applications write to several different log files and, for this and some other reasons, we cannot simply rely on the stdout and stderr outputs for Filebeat. Instead, the pods write their logs to volumes that live on the host file system, and we mount these volumes into the Filebeat pod so it can read and ship them.
This works fine, except that all events appear to originate from the Filebeat pod and we lose all the Kubernetes metadata that normally gets appended.

The pods' log volumes are mounted in:

/var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~empty-dir/<volume-name>

The Filebeat configuration:

apiVersion: v1
data:
  filebeat.yml: |-
    filebeat.config:
      prospectors:
        # Mounted `filebeat-prospectors` configmap:
        path: ${path.config}/prospectors.d/*.yml
        # Reload prospectors configs as they change:
        reload.enabled: false
      modules:
        path: ${path.config}/modules.d/*.yml
        # Reload module configs as they change:
        reload.enabled: false

    processors:
      - add_cloud_metadata:
      - add_kubernetes_metadata:
          in_cluster: true

    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST}:${ELASTICSEARCH_PORT}']
      username: ${ELASTICSEARCH_USERNAME}
      password: ${ELASTICSEARCH_PASSWORD}
      index: "filebeat-%{[beat.version]}-logs-%{+xxxx.ww}"

    setup.template.name: "filebeat-%{[beat.version]}"
    setup.template.pattern: "filebeat-%{[beat.version]}-*"
kind: ConfigMap

apiVersion: v1
data:
  kubernetes.yml: |-
    - type: log
      paths:
        - /var/lib/kubelet/pods/*/volumes/kubernetes.io~empty-dir/applogs/*.log
      exclude_files: ['\.gz$', 'gc.log']
kind: ConfigMap

A sample event has the following source:

/var/lib/kubelet/pods/005f3b90-4b9d-12f8-acf0-31020a840133/volumes/kubernetes.io~empty-dir/applogs/server.log

The current code tries to extract the container ID from a path under either /var/lib/docker/containers or /var/log/containers, so unless I am overlooking some configuration option, it won't work with the paths above.
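To make the idea concrete, here is a minimal sketch of how a pod-uid could be pulled out of a source path like the sample above. It is written in Python purely for illustration and is not the Beats implementation; the function name `extract_pod_uid` and the regular expression are my own assumptions, based only on the mount layout shown earlier.

```python
import re
from typing import Optional

# Assumed layout, taken from the mount path shown earlier in this post:
# /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~empty-dir/<volume-name>/<file>
POD_LOG_PATH = re.compile(
    r"^/var/lib/kubelet/pods/(?P<pod_uid>[^/]+)/volumes/"
    r"kubernetes\.io~empty-dir/(?P<volume>[^/]+)/"
)


def extract_pod_uid(source: str) -> Optional[str]:
    """Return the pod UID embedded in a log path, or None if the path does not match."""
    match = POD_LOG_PATH.match(source)
    return match.group("pod_uid") if match else None


source = (
    "/var/lib/kubelet/pods/005f3b90-4b9d-12f8-acf0-31020a840133"
    "/volumes/kubernetes.io~empty-dir/applogs/server.log"
)
print(extract_pod_uid(source))  # prints 005f3b90-4b9d-12f8-acf0-31020a840133
```

A path that does not follow this layout (for example one under /var/log/containers) simply returns None, so a lookup like this could sit alongside the existing container-id logic rather than replace it.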

Perhaps there are a few ways to enhance the logic to help in such a situation:

  • Use the pod-uid to find the related /var/lib/docker/containers path, and hook this logic in before the Matcher kicks in.

  • Create a second type of Matcher and use the configuration file to specify options and paths for it.

  • Modify the add_kubernetes_metadata logic to work with a pod-uid too.

All options would need to account for the possibility of multiple containers per pod (perhaps by mounting each container's log volume in a different subpath).
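As a rough sketch of the second and third options, the lookup could be keyed by pod-uid, with the volume name used to tell containers apart. This is Python for illustration only, not Beats code: `PodMetadata`, `volume_to_container`, `enrich` and the exact shape of the output fields are all hypothetical, and in practice the index would have to be built from the Kubernetes API rather than by hand.

```python
import re
from dataclasses import dataclass, field
from typing import Dict

# Assumed path layout, as in the mount path shown earlier in this post.
PATH = re.compile(
    r"^/var/lib/kubelet/pods/(?P<uid>[^/]+)/volumes/"
    r"kubernetes\.io~empty-dir/(?P<volume>[^/]+)/"
)


@dataclass
class PodMetadata:
    """Hypothetical record of what is known about one pod."""
    name: str
    namespace: str
    # Hypothetical mapping: log volume name -> container writing to it.
    volume_to_container: Dict[str, str] = field(default_factory=dict)


def enrich(event: dict, pods_by_uid: Dict[str, PodMetadata]) -> dict:
    """Return a copy of the event with pod metadata attached, if the source path matches."""
    match = PATH.match(event.get("source", ""))
    if match is None:
        return event  # not a pod volume path, leave the event untouched
    pod = pods_by_uid.get(match.group("uid"))
    if pod is None:
        return event  # pod-uid not known to the index
    kubernetes = {
        "pod": {"name": pod.name, "uid": match.group("uid")},
        "namespace": pod.namespace,
    }
    # With several containers per pod, the volume name disambiguates them.
    container = pod.volume_to_container.get(match.group("volume"))
    if container is not None:
        kubernetes["container"] = {"name": container}
    return {**event, "kubernetes": kubernetes}
```

The point of keying on the volume name is that a pod-uid alone cannot identify a container; if every container in a pod wrote to the same volume, only pod-level metadata could be attached.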

I'd be more than happy to work on this and create a PR, but first I want to understand the different points of view and any recommendations.

Cheers!


(Carlos Pérez Aradros) #2

Hi @mariomechoulam,

This proposal looks good to me, I like the use case. Could you please open a new enhancement request so we can track interest and work on it? https://github.com/elastic/beats/issues/new

Best regards

