[Filebeat] Define custom Ingest Node for Kafka Output

Good afternoon guys!

I have a question regarding the configuration of pipelines in Filebeat when we're using Kafka as the output:

When we have Elasticsearch as the output, we can simply define the pipeline we want to use to process the data under output.elasticsearch.pipeline, for example:

output.elasticsearch:
  hosts: ["localhost:9200"]
  pipeline: my_pipeline_id
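
For reference, the pipeline itself has to exist in Elasticsearch first. A minimal sketch of what it could look like (the set processor here is just a placeholder for the real processing):

PUT _ingest/pipeline/my_pipeline_id
{
  "description": "Minimal placeholder pipeline",
  "processors": [
    { "set": { "field": "processed_by", "value": "my_pipeline_id" } }
  ]
}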

My question is: how do I define the pipeline that I want to use when Kafka is set up as the output?

As far as the documentation indicates, the Kafka output doesn't have a pipeline attribute. However, I know that when we use modules, even with Kafka as the output, we can tell ES to use the module's pipeline to process the data, so I want to do something similar, but with custom ingest pipelines.
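
For what it's worth, when a module's events reach Elasticsearch via Logstash, the documented approach forwards the pipeline ID from [@metadata][pipeline] on the Logstash side, roughly like this (just a sketch; I'm not sure that metadata survives a trip through Kafka, since @metadata isn't part of the serialized event by default):

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # Forward the pipeline ID that Filebeat attached to the event
    pipeline => "%{[@metadata][pipeline]}"
  }
}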

Also, I'd like to set up a custom pipeline depending on the container image, using autodiscover, so I was thinking of something like:

filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              docker.container.image: custom_image
          config:
            - type: container
              pipeline: my_custom_pipeline
              paths:
                - /var/lib/docker/containers/${data.docker.container.id}/*.log

But if this is not possible, I could set up several Filebeat instances, each one using a different custom ingest pipeline, so the priority would be to be able to set this pipeline attribute on a Kafka output.

Does anyone know if something like that is possible?

Did you consider adding Logstash between Filebeat and Kafka? You could use Logstash to process the data (run pipelines).
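
For example, something like this on the Logstash side (just a rough sketch; the port, topic name, and filter are placeholders):

input {
  beats {
    port => 5044
  }
}

filter {
  # Do here whatever the ingest pipeline would have done
  mutate {
    add_tag => ["processed_in_logstash"]
  }
}

output {
  kafka {
    bootstrap_servers => "localhost:9092"
    topic_id => "filebeat-logs"
    codec => json
  }
}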

I have Logstash, but it's between Kafka and Elasticsearch.

Basically, I have the following architecture right now in my ELK Stack:

Filebeat -> Kafka -> Logstash -> Elasticsearch

I've thought about adding a custom field in the autodiscover input, taking advantage of the autodiscover condition, and telling Logstash to pass it on to ES.

filebeat.yml

filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              docker.container.image: custom_image
          config:
            - type: container
              paths:
                - /var/lib/docker/containers/${data.docker.container.id}/*.log
              processors:
                - add_fields:
                    # The default target is "fields", so this ends up as fields.pipeline
                    fields:
                      pipeline: 'my_custom_pipeline'

Logstash configuration

output {
  if [fields][pipeline] {
    elasticsearch {
      hosts => ["localhost:9200"]
      pipeline => "%{[fields][pipeline]}"
    }
  } else {
    # Events without the custom field still need an output, or they are dropped
    elasticsearch {
      hosts => ["localhost:9200"]
    }
  }
}
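
To test the pipeline on its own, independent of the shipping path, I could dry-run it with the simulate API (assuming my_custom_pipeline already exists in ES; the document below is a made-up sample):

POST _ingest/pipeline/my_custom_pipeline/_simulate
{
  "docs": [
    { "_source": { "message": "sample log line from custom_image" } }
  ]
}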

Do you think that would work?
