How to add an additional ingest pipeline to filebeat when using a module

I have apache logs, so I use the apache module in filebeat. However, I also want to apply an ingest pipeline I have made that adds an email field, but apparently the output.elasticsearch.pipeline option doesn't work when you are using a module. To be clear, I want to apply my pipeline after it has been parsed by the default apache ingest pipeline. How would I go about doing this? I have tried the input.pipeline option in the module config, and that doesn't work.

Hi @Zack1 Welcome to the community!

So you want to add additional processing to a module. Awesome question... I think I can help... it's a pretty common ask, but the solution is not obvious.

First of all, what version are you on? I am doing this example on 8.2.3, but 7.x should mostly be the same.

2nd...

Close but not exactly... to override the pipeline with a module, you will need to override it as an input parameter.

Unfortunately that is not really clear, as it is covered in 2 separate places in the docs.

Override Input Settings and an example of how it works here; this is the log input, which I believe the apache module uses under the covers.

The important part of the verbiage being...

pipeline

The ingest pipeline ID to set for the events generated by this input.

The pipeline ID can also be configured in the Elasticsearch output, but this option usually results in simpler configuration files. If the pipeline is configured both in the input and output, the option from the input is used.

This is what is happening: the pipeline is defined on the input, so it overrides the output.elasticsearch configuration.

Ok, so now we have the first part of the answer: we need to define your new pipeline in the input section.

2nd, we will want to "compose" a top-level pipeline that first calls the default apache pipeline and then calls your added details pipeline. I like this method because the top-level pipeline will not have any actual processing in it.

So here it is all together

modules.d/apache.yml

# Module: apache
# Docs: https://www.elastic.co/guide/en/beats/filebeat/8.2/filebeat-module-apache.html

- module: apache
  # Access logs
  access:
    enabled: true
    # Set my new pipeline, overrides the default
    input:
      pipeline: my-apache-access-pipeline-main

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: ["/Users/sbrown/workspace/sample-data/apache/apache-small.log"]

  # Error logs
  error:
    enabled: false

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:

Now in Kibana-Dev Tools

Run this and find the filebeat pipeline
GET _ingest/pipeline

On 8.2.3 it will look like

{
  "filebeat-8.2.3-apache-access-pipeline" : {
    "on_failure" : [

So now create your top-level pipeline that calls the default pipeline and your details pipeline

DELETE _ingest/pipeline/my-apache-access-pipeline-main

GET _ingest/pipeline/my-apache-access-pipeline-main

PUT _ingest/pipeline/my-apache-access-pipeline-main
{
  "description": "My Top Level Apache Pipeline",
  "processors": [
    {
      "pipeline": {
        "description" : "Call the default apache access pipeline",
        "name": "filebeat-8.2.3-apache-access-pipeline"
      }
    },
    {
      "pipeline": {
        "description" : "Call my additional apache access pipeline",
        "name": "my-apache-access-pipeline-details"
      }
    }
  ]
}

and then create your "details" pipeline to do what you want

DELETE _ingest/pipeline/my-apache-access-pipeline-details

GET _ingest/pipeline/my-apache-access-pipeline-details

PUT _ingest/pipeline/my-apache-access-pipeline-details
{
  "processors": [
    {
      "set": {
        "field": "my-new-field",
        "value": "my-value"
      }
    }
  ]
}

Now I ran this... and I got the full apache parsing and fields, plus my additional field.

Hope this helps.

BTW, there are other ways to do this, such as with a runtime field, but I answered using the pipeline method as that is what you asked about, and it is a solid solution. Note that you will need to update that top-level pipeline when you upgrade filebeat, so it points at the latest module pipeline, etc.

Hi @stephenb, thanks for the quick reply. That looks like it will do what I want, however, do you know if there is a more programmatic way to get the name of the default apache pipeline? My current setup is with docker, and I'm trying to create python scripts that set all of this up automatically, i.e. without going into Kibana to look at what the pipeline is called.

Well, pretty sure it has been this naming convention for a long time for the apache module.

<beatname>-<version>-<module>-<fileset>-pipeline

BUT that may not be true for every module / fileset...
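Since you are scripting this in Python anyway, one option is to list the installed pipelines and filter for the one matching the naming convention, rather than hard-coding the version. This is just a sketch: the helper name is mine, and the commented-out client call assumes the official elasticsearch Python client is installed and configured.

```python
import re

def find_module_pipeline(pipeline_ids, module="apache", fileset="access"):
    """Return the pipeline IDs that match the Filebeat
    <beatname>-<version>-<module>-<fileset>-pipeline convention."""
    pattern = re.compile(
        rf"^filebeat-\d+\.\d+\.\d+-{module}-{fileset}-pipeline$"
    )
    return [pid for pid in pipeline_ids if pattern.match(pid)]

# In practice the IDs would come from GET _ingest/pipeline, e.g.:
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch("http://localhost:9200")
#   ids = list(es.ingest.get_pipeline().keys())
ids = [
    "filebeat-8.2.3-apache-access-pipeline",
    "filebeat-8.2.3-apache-error-pipeline",
    "my-apache-access-pipeline-main",
]
print(find_module_pipeline(ids))
# ['filebeat-8.2.3-apache-access-pipeline']
```

You could then feed the discovered name into the PUT for the top-level pipeline instead of typing it by hand.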

The easiest way is to create your new ingest pipeline and configure it in the index.final_pipeline setting for your index, this way you do not need to make any changes in the filebeat module.

The pipeline configured in this setting will be executed after the pipeline that the module is using.

Depending on how you are indexing, you will probably need to change the template for your index to add this setting to new index.
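For example, in Dev Tools you could set it in an index template like this (a sketch; the template name and index pattern here are hypothetical, so adjust them to match your own setup):

```
PUT _index_template/my-filebeat-template
{
  "index_patterns": ["filebeat-*"],
  "template": {
    "settings": {
      "index.final_pipeline": "my-apache-access-pipeline-details"
    }
  }
}
```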


Yup, that is another great way... but sometimes on the newer modules and index templates, index.final_pipeline may already be set.

So @Zack1 you have 2 ways to do it!
