Configuring multiple filebeat instances in Kubernetes

We currently run a setup that works well for indexing nginx sidecar logs with filebeat. It consists of one filebeat container per node, and our app deployments carry annotations such as

        co.elastic.logs.nginx/enabled: "true"

(we make sure to actually name the nginx sidecar "nginx").
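
To illustrate, a pod template along these lines would be picked up. This is only a sketch: the deployment name, labels, images and the "app" container name are placeholders, and only the annotation and the "nginx" sidecar name reflect the setup described above.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: example-app                # placeholder name
    spec:
      selector:
        matchLabels:
          app: example-app
      template:
        metadata:
          labels:
            app: example-app
          annotations:
            # hint scoped to the container named "nginx"
            co.elastic.logs.nginx/enabled: "true"
        spec:
          containers:
            - name: app                # "main" container, placeholder
              image: example/app:latest
            - name: nginx              # must match the name used in the annotation
              image: example/nginx:latest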

This is the filebeat config we're running:

    filebeat.autodiscover:
      providers:
        - type: kubernetes
          hints.enabled: true
          hints.default_config.enabled: false
          include_annotations: ["logtype"]
          include_labels: ["app"]
          labels.dedot: true
          annotations.dedot: true
    processors:
      - decode_json_fields:
          fields: ["message"]
          target: ""
          max_depth: 2
          overwrite_keys: false
      - add_kubernetes_metadata:
          in_cluster: true

It's also worth mentioning that our nginx image uses a custom JSON-formatted logger rather than any of filebeat's built-in modules, because we like to have full control ourselves. So far, all of this works really well.

The challenge: we'd like a similar setup that indexes application logs from the "main" container. These will probably also be JSON-formatted, but we'd like some flexibility here, so I'm trying to figure out what the best option is.

As far as I can see, the kubernetes autodiscover setup in filebeat doesn't really support running multiple filebeat containers with different configs, because there's no way to direct a container's log file to a specific filebeat instance. Data from filebeat ends up in logstash, where each pipeline listens on its own isolated port. Again, we do this to avoid large and hard-to-understand pipeline code blocks, and it also allows us to decouple our pipelines later if needed.

So, in essence, we need two different filebeat instances pushing data to two completely separate logstash pipelines. I'm struggling to figure out how to do this without running a filebeat sidecar alongside each kubernetes pod, which would work but adds unnecessary overhead.
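
In config terms, the two instances would differ mainly in their output section, roughly as sketched below. The hostname and port numbers are made up for illustration; the two snippets belong to two separate filebeat config files, one per instance.

    # filebeat.yml for the "nginx" instance (hypothetical host and port)
    output.logstash:
      hosts: ["logstash.example.svc:5044"]

    # filebeat.yml for the "app logs" instance (hypothetical host and port)
    output.logstash:
      hosts: ["logstash.example.svc:5045"]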

The only thing I can think of is to use the "processors" directive in such a way that the "nginx filebeat instance" drops all messages NOT from a container named "nginx", and the "app logs instance" drops all messages NOT from a container named "app". But this seems suboptimal, especially since we'd have to parse potentially large amounts of data twice, only to discard much of it.
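
A minimal sketch of what that could look like for the "nginx" instance, assuming the events carry a kubernetes.container.name field by the time the condition is evaluated (the "app" instance would use the same block with "app" instead):

    processors:
      - add_kubernetes_metadata:
          in_cluster: true
      # discard every event that does not come from a container named "nginx"
      - drop_event:
          when:
            not:
              equals:
                kubernetes.container.name: "nginx"
      # JSON decoding comes after the drop, so discarded events are never parsed
      - decode_json_fields:
          fields: ["message"]
          target: ""
          max_depth: 2
          overwrite_keys: false

Putting drop_event ahead of decode_json_fields would at least avoid decoding the discarded events, but each instance would still read every container log file on the node.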

Any hints appreciated.
