Filebeat is overwriting the pipeline specified in Elasticsearch on start

I am using Filebeat to collect CloudWatch logs and I have modified the ingest node pipeline to extract and index some more information from the logs. However, when Filebeat has restarted the extra processors that I added disappear and it seems the whole pipeline is overwritten. Is there a way to ensure the pipeline isn't altered when starting Filebeat?

I have also observed that when I specify a pipeline in filebeat.yml, Filebeat seems to ignore this and use the default. I define the pipeline as shown below.

output.elasticsearch:
    hosts: ["127.0.0.1:9243"]
    pipeline: "filebeat-7.11.0-aws-cloudtrail-pipeline-test"

Hi @JamblaInc Welcome to the community.

There is some subtle magic to this, I think... the default pipeline is used and overrides what you are specifying in the Elasticsearch output. I believe you will need to define it in the input section.

So first, are you using the AWS module, the AWS CloudWatch input, or both?

If you are using the AWS CloudWatch input, you would specify it there.

See pipeline ... I think the magic is that it actually has a default value, and if you read the little section below, it says:

The pipeline ID can also be configured in the Elasticsearch output, but this option usually results in simpler configuration files. If the pipeline is configured both in the input and output, the option from the input is used.

Which means the pipeline set on the input (including its default value) will always override the output section.
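To make that precedence concrete, here is a minimal sketch (the pipeline names, log group ARN, and hosts are placeholders, not taken from this thread) of a pipeline set on the aws-cloudwatch input winning over one set in the output:

```yaml
filebeat.inputs:
  - type: aws-cloudwatch
    log_group_arn: "arn:aws:logs:us-east-1:111111111111:log-group:my-group:*"
    pipeline: my-custom-pipeline      # set at the input: this one is used

output.elasticsearch:
  hosts: ["127.0.0.1:9243"]
  pipeline: some-other-pipeline       # ignored whenever the input sets one
```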

This may not be it, but take a look and let us know...

There was a similar discussion a while back...

Thanks for the reply, I am using the AWS Module but I don't see an option to define a pipeline in there. So where exactly do I define it?

Can you share your sanitized config? It looks like this is missing from the docs, or we might need to do something else. It is still beta.

I think it goes at the same level as enabled, like this:

- module: aws
  cloudtrail:
    enabled: true
    input:
      pipeline: my-pipeline

This didn't work either, it's still loading the default pipeline.

My aws.yml:

- module: aws
  cloudtrail:
    enabled: true
    var.queue_url: URL
    input:
      pipeline: filebeat-7.11.0-aws-cloudtrail-pipeline-test

Apologies, I don't have a direct answer, but... hmmm... are you up for a little debugging? Unfortunately I don't have an AWS test bed handy.

So let's take a look with a little more verbose logging.

You can run Filebeat in the foreground with these parameters. Warning: this will be quite verbose, so I would have only cloudtrail enabled to cut it down. Let it run until you see it processing messages, then kill it.

When you run Filebeat in the foreground like this, it will dump a lot of config:

filebeat -e -d "*" or ./filebeat -e -d "*", depending on how you installed it.

It will tell you it has loaded the default pipelines, with log lines like this... don't let that distract you; it will always say that.

2021-03-12T07:32:17.519-0800    DEBUG   [esclientleg]   eslegclient/connection.go:364   GET http://localhost:9200/_nodes/ingest  <nil>
2021-03-12T07:32:17.524-0800    DEBUG   [esclientleg]   eslegclient/connection.go:364   GET http://localhost:9200/_ingest/pipeline/filebeat-7.11.1-nginx-access-pipeline  <nil>
2021-03-12T07:32:17.527-0800    DEBUG   [modules]       fileset/pipelines.go:120        Pipeline filebeat-7.11.1-nginx-access-pipeline already loaded
2021-03-12T07:32:17.527-0800    DEBUG   [modules]       fileset/pipelines.go:67 Required processors: []
2021-03-12T07:32:17.527-0800    DEBUG   [esclientleg]   eslegclient/connection.go:364   GET http://localhost:9200/_ingest/pipeline/filebeat-7.11.1-nginx-error-pipeline  <nil>
2021-03-12T07:32:17.531-0800    DEBUG   [modules]       fileset/pipelines.go:120        Pipeline filebeat-7.11.1-nginx-error-pipeline already loaded

But what we want to see is a couple of the published event messages...

The header of each should look like this; we want to see what that pipeline value is.

2021-03-12T07:35:55.708-0800    DEBUG   [processors]    processing/processors.go:203    Publish event: {
  "@timestamp": "2021-03-12T15:35:55.708Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.11.1",
    "pipeline": "my-pipeline" <------- This Value 
  },
  "log": {
    "offset": 790,
    "file": {
      "path": "/Users/sbrown/workspace/sample-data/nginx/nginx-5rows.log"
    }
  },

Let me know what you see.

So this is showing the correct pipeline. I had a look at the events and it seems they are now being properly processed. I think the previous step may have fixed it.

Thank you for resolving this. It seems the only outstanding issue is that, if the default pipeline is used, it is overwritten every time Filebeat is restarted.

Good to hear,

Yes, if you edit the default pipeline that will happen... I do think there is a way to stop that as well, by setting template management to false (or some other setting; I would need to check, and it might cause other unintended consequences). But editing the default pipeline is probably not best practice, as there is a lot of logic for getting modules back to a working / default state. After all, the default pipeline is, by definition, just that; what you created is a custom pipeline. :slight_smile:
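For reference (this is from memory rather than from this thread, so worth double-checking against the docs for your Filebeat version), Filebeat has an `overwrite_pipelines` setting that controls whether it re-pushes module ingest pipelines to Elasticsearch:

```yaml
# filebeat.yml
# When false (the default), ingest pipelines that already exist with the
# same ID are left alone; when true, Filebeat overwrites them on every
# new Elasticsearch connection.
filebeat.overwrite_pipelines: false
```

Even so, as noted above, leaving the shipped default pipeline untouched and pointing the input at a custom pipeline is the safer pattern.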

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.