Where is add_kubernetes_metadata documented?

At

https://www.elastic.co/guide/en/beats/filebeat/current/add-kubernetes-metadata.html

there is an example with the "default indexers and matchers disabled and enables ones that the user is interested in", but I can't see any links to anything that defines how to configure the add_kubernetes_metadata processor.

The use case is that different pods/containers are going to output logs in different formats, for which there are different grok parsers in Logstash. So I need to add a custom field in the K8s deployment of Filebeat, just as I do in the standalone deployment, to tell Logstash which format a particular log line is in. I guess I want to do that by label, or something like that? And there's no point in ingesting logs from applications I don't understand and can't parse, so I'll want to process data only from known containers/pods/labels/whatever.

How do I do this please?

Hi @TimWard,

There is an ongoing effort to improve this documentation: https://github.com/elastic/beats/issues/5566

As you guessed, what you want is to include some labels or annotations along with the rest of the metadata added by add_kubernetes_metadata. You can achieve that with these settings:

processors:
  - add_kubernetes_metadata:
      include_annotations:
        - annotation_to_include

All labels are included by default.

If all labels are included by default that sounds like a good start. The questions are then:

(1) How do I tell Filebeat not to process logs from pods that don't have labels I'm interested in? E.g. "if there is no label called log_type, do not process events from this pod".

(2) How do I set a custom field in the event depending on one of the labels? E.g. "if there is a label called log_type in this pod, add a custom field log_type to the events from this pod".

The point of (2) being that Logstash looks for particular values of a custom field in order to decide which grok parser to use and which Elasticsearch index to send the output to.

Yes I have come to realise that I could do all this in Logstash by looking for the label metadata, but there are two problems with that approach:

(a) Vast amounts of resource could be wasted prospecting pods I'm not interested in, shipping their events to Logstash, and then throwing them away.

(b) The Logstash code would then need to know about K8s deployment, and I think it would be cleaner if it didn't need different code depending on whether a particular application was or wasn't deployed in K8s.

Thank you for the explanation. It helps when we understand your exact use case, and it also helps to shape our roadmap :slight_smile:

  1. You can use autodiscover for this; see: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-autodiscover.html

For instance:

filebeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        # Point 1, only start an input for pods that carry the label:
        - condition:
            regexp:
              kubernetes.labels.yourlabel: '.*'
          config:
            - type: docker
              containers.ids:
                - "${data.kubernetes.container.id}"
              # Point 2, add custom fields to events:
              fields:
                yourdesiredfield: "${data.kubernetes.labels.yourlabel}"
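
Applied to the log_type example from the question, the same template might look roughly like the sketch below. The label name log_type comes from the question; the rest mirrors the snippet above. One thing to keep in mind is that Filebeat groups custom fields under a "fields" key by default, so the value would arrive in Logstash as [fields][log_type]. The optional fields_under_root setting moves it to the top level of the event. This is only a sketch under those assumptions and has not been tested against a cluster.

```yaml
filebeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        # Only pods that have a log_type label get an input at all.
        - condition:
            regexp:
              kubernetes.labels.log_type: '.*'
          config:
            - type: docker
              containers.ids:
                - "${data.kubernetes.container.id}"
              # Copy the label value into a custom field on every event.
              fields:
                log_type: "${data.kubernetes.labels.log_type}"
              # Optional: store log_type at the top level of the event
              # instead of under "fields".
              fields_under_root: true
```

Whether to set fields_under_root depends on where the existing Logstash conditionals for the standalone deployment expect to find the field.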

Also, 6.3 will bring a nice feature that lets you define this behavior through Kubernetes annotations. Stay tuned for its release :slight_smile: https://github.com/elastic/beats/pull/7228

Best regards

Thanks very much - so many moving parts, so many new features :grinning::+1:

(I'll close this when I get it working, which may or may not be today.)
