Filebeat Autodiscover Hints Breaking Template

Hi,

We're using the config below to collect Kubernetes logs based on the presence of a specific annotation:

    filebeat.autodiscover:
      providers:
        - type: kubernetes
          in_cluster: true
          tags:
            - "kubernetes"
          templates:
            - condition:
                contains:
                  kubernetes.annotations.logging: "true"
              config:
                - type: docker
                  tail_files: true
                  containers.ids:
                  - "${data.kubernetes.container.id}"
                  processors:
                  - add_kubernetes_metadata:
                      in_cluster: true
                  - add_cloud_metadata: ~

On its own this works as expected: we get logs for pods that carry the logging annotation, with cloud metadata attached to each event.

We thought we could add hints.enabled: true to the configuration so that certain pods could specify custom parsing options (e.g. to handle multiline logs). However, when we add hints.enabled: true to the provider configuration (example below), Filebeat suddenly starts collecting logs from everything in Kubernetes, and the events are missing the cloud metadata:

        - type: kubernetes
          hints.enabled: true
          in_cluster: true

What we're after is the ability to send logs only from pods that have a specific annotation, and then, on top of that, to let those pods use hint annotations to improve parsing of multiline messages and the like. Is that possible?

Cheers,
Mike

Hi @Evesy,

When you are using hints, all logs are retrieved by default, as you said. But you can change that behavior, as templates are evaluated first, for instance creating a template that throws no configs.

Also, there is a hint to disable logging; try co.elastic.logs/disable: true. I'll have a look at the docs, as we may need to include this one.
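
For reference, a hint like this is set as an annotation on the pod itself. The following is a minimal sketch, assuming the hint name given above; the pod name, container name, and image are hypothetical. The value is quoted because Kubernetes only accepts strings as annotation values:

    apiVersion: v1
    kind: Pod
    metadata:
      name: noisy-app                      # hypothetical pod name
      annotations:
        # Hint named in the reply above; asks Filebeat to skip this pod's logs
        co.elastic.logs/disable: "true"
    spec:
      containers:
        - name: app                        # hypothetical container name
          image: example/noisy-app:latest  # hypothetical image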

Best regards

Hey @exekias

Thanks very much for the quick response! I'm pretty new to Filebeat; would you possibly be able to clarify what you mean by this?

for instance creating a template that throws no configs

Are there any plans for, or would a feature request be welcome for, the inverse of the current behaviour: an option so that hints collect no logs by default, with collection only enabled when co.elastic.logs/enable: true is set?

Sorry, I was not clear enough :), I was thinking of a config like this:

    filebeat.autodiscover:
      providers:
        - type: kubernetes
          in_cluster: true
          hints.enabled: true
          templates:
            - condition:
                contains:
                  kubernetes.annotations.logging: "false"
              config:

As templates are processed first, if logging == false, hints won't be processed. In practice this removes the need for adding it as a feature, as you can use these settings to get the behavior you want.

BTW: I just opened a PR for the docs: https://github.com/elastic/beats/pull/7406

@exekias Is this block in the example supposed to have a not statement somewhere:

            - condition:
                contains:
                  kubernetes.annotations.logging: "false"

As I understand it from that example, files will be processed for anything with logging == false.

Apologies if I'm missing the point (quite likely). Ideally I don't want to rely on everything having to include an annotation to be excluded from logging; I'd rather exclusion be the default, with an annotation to opt in.

Thanks again
Mike

OK then, something like this should do (I believe, to be tested):

    filebeat.autodiscover:
      providers:
        - type: kubernetes
          in_cluster: true
          hints.enabled: true
          templates:
            - condition:
                not.contains:
                  kubernetes.annotations.logging: "true"
              config:

@exekias As soon as I add hints.enabled: true I start seeing all files being processed. I've tried both of the templates below:

          templates:
            - condition:
                contains:
                  kubernetes.annotations.logging: "true"
              config:

(Only process logs with logging == true)

          templates:
            - condition:
                not.contains:
                  kubernetes.annotations.logging: "true"
              config:

(Only process logs where logging != true)

Uhm, I think you are right. After checking the code, hints are still applied whenever a template doesn't provide a valid config. So templates are useful for overriding some behavior, but not for cancelling it entirely.

In this case, I would be OK with accepting a feature request or pull request to support what you need. Feel free to open a new issue, and please give as much detail as possible: https://github.com/elastic/beats/issues
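
To illustrate the distinction, here is a sketch of a template that does supply a config. Assuming the behavior described above, pods matching the condition would get this explicit input in place of hint-generated settings, while all other pods would still be collected through hints. The input settings are taken from the original config earlier in the thread; the multiline values are illustrative assumptions only:

    filebeat.autodiscover:
      providers:
        - type: kubernetes
          in_cluster: true
          hints.enabled: true
          templates:
            - condition:
                contains:
                  kubernetes.annotations.logging: "true"
              # A non-empty config: this overrides hints for matching pods
              config:
                - type: docker
                  containers.ids:
                    - "${data.kubernetes.container.id}"
                  # Illustrative multiline settings (assumed log format)
                  multiline.pattern: '^\['
                  multiline.negate: true
                  multiline.match: after

This only changes how matching pods are parsed; it does not stop collection from the pods that fail the condition.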

Best regards

Thank you for the help @exekias, it's much appreciated.

I have raised https://github.com/elastic/beats/issues/7407 in relation to this discussion

Thanks,
Mike

Hey @exekias,

I have managed to almost achieve what I was after. I've been able to use the hints autodiscover to only publish logs with a specific annotation, whilst also being able to make use of multiline hints:

    filebeat.autodiscover:
      providers:
        - type: kubernetes
          hints.enabled: true
          include_annotations: ['logging']

    processors:
    - add_cloud_metadata: ~
    - drop_event:
        when:
          not:
            equals:
              kubernetes.annotations.logging: 'true'
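
With a setup like this, a pod would opt in through the logging annotation and describe its own multiline format through hint annotations. A minimal sketch follows; the pod name, container name, image, and multiline pattern are hypothetical, and the pattern assumes each new log event starts with an opening bracket:

    apiVersion: v1
    kind: Pod
    metadata:
      name: java-app                       # hypothetical pod name
      annotations:
        logging: "true"                    # opt-in annotation checked by drop_event
        # Multiline hints: lines that do not start with "[" are appended
        # to the previous line that did
        co.elastic.logs/multiline.pattern: '^\['
        co.elastic.logs/multiline.negate: "true"
        co.elastic.logs/multiline.match: "after"
    spec:
      containers:
        - name: app                        # hypothetical container name
          image: example/java-app:latest   # hypothetical image

Note that with this approach Filebeat still reads the logs of every pod; unannotated pods are only filtered out at the processor stage.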

The last thing I'm struggling to get working is tail_files within the Kubernetes autodiscovery (since our Filebeat instances are stateless). Currently, when a Filebeat instance restarts, it re-reads logs that have already been scraped, resulting in duplicate messages being published.

Is there somewhere I can configure the hints discovery/kubernetes logger to only tail files?
Neither of the configs below appears to work:

    filebeat.autodiscover:
      providers:
        - type: kubernetes
          hints.enabled: true
          include_annotations: ['logging']
          config:
            - type: docker
              tail_files: true

    processors:
    - add_cloud_metadata: ~
    - drop_event:
        when:
          not:
            equals:
              kubernetes.annotations.logging: 'true'

and:

    filebeat.autodiscover:
      providers:
        - type: kubernetes
          hints.enabled: true
          include_annotations: ['logging']
          tail_files: true

    processors:
    - add_cloud_metadata: ~
    - drop_event:
        when:
          not:
            equals:
              kubernetes.annotations.logging: 'true'

Nice, I'm glad that did the trick.

You don't need to use tail_files. Filebeat can keep a registry of what's sent and what's pending; you just need to update this volume to use a hostPath: https://github.com/elastic/beats/blob/master/deploy/kubernetes/filebeat-kubernetes.yaml#L117-L121
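
As a rough sketch of that change, the data volume in the Filebeat DaemonSet could be backed by a directory on the node, so the registry survives pod restarts. The host path below is an assumption, and the mount path assumes the default data directory of the official Filebeat image:

    # Fragment of the Filebeat DaemonSet spec
    spec:
      template:
        spec:
          containers:
            - name: filebeat
              volumeMounts:
                - name: data
                  mountPath: /usr/share/filebeat/data   # assumed Filebeat data dir (holds the registry)
          volumes:
            - name: data
              hostPath:
                path: /var/lib/filebeat-data            # assumed directory on the node
                type: DirectoryOrCreate                 # create it if it does not exist

Because the registry is kept per node, a restarted Filebeat pod on the same node resumes from the recorded offsets instead of re-reading files from the start.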

Best regards