Using custom pipeline with filebeat module

DPattee · November 30, 2019, 6:45am

I've read a hundred pages of docs, forum posts, and random blogs and am just as confused as I was to begin with. I took a break from this project for a few months, but nothing magically changed.

Overall Scenario:

The filebeat osquery module has almost all values saved as strings.
I want the numeric types to actually be saved as numbers to enable better math and graph capabilities. (See post from July: OSQuery field types)
Ingest processors are the solution for converting fields/values.
Due to how the osquery module works, the fields that need to be converted can't be accessed by processors defined within the filebeat module yml; the conversion has to happen in an elasticsearch ingest pipeline (See Using a processor in a filebeat module may or may not actually find fields).

Problem:

I cannot get a pipeline processor to actually apply to a message; it seems to be totally ignored.

Steps performed:

Ensured filebeat/elasticsearch/osquery were all talking to each other fine: data shows up as expected, and the default pipelines were installed via setup --pipelines --modules osquery.
Verified I could successfully add a field directly from the filebeat osquery module yml [my-osquery.yml]. This worked.
Removed that block from the my-osquery.yml
Created a custom pipeline that would add a field
Updated the main filebeat config [my-filebeat.config.yml] to point to my pipeline and to use a new index pattern
Stopped and restarted filebeat with that new config
Checked new data, did not see the additional field
Used Kibana console to run a POST _ingest/pipeline/test_osquery_fixer/_simulate with the _source from a real osquery message. Verified that the custom pipeline does work correctly in that mode.
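
For reference, the simulate check in the last step can be sketched as the Kibana console request below. The `_source` shown is only a placeholder (an assumption, not a real osquery event) and should be replaced with the `_source` copied from an actual indexed document. The `verbose=true` parameter asks Elasticsearch to report the result of each processor separately, which makes it easier to confirm that the `set` step ran.

```json
# Sketch only: the _source below is a placeholder, not a real osquery event.
POST _ingest/pipeline/test_osquery_fixer/_simulate?verbose=true
{
  "docs": [
    {
      "_source": {
        "message": "placeholder; paste the _source of a real osquery document here"
      }
    }
  ]
}
```

A passing simulate only shows that the pipeline definition is valid. It does not show that filebeat is actually sending events through that pipeline.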

Random things tried:

Updated my pipeline to include a step that uses the pipeline processor to run the original osquery pipeline too, just in case that was needed, but that didn't work.
Added a pipeline config line to the osquery module yml on the off chance it would help; I didn't expect that to work.

Here is my pipeline:

GET _ingest/pipeline/test_osquery_fixer

{
  "test_osquery_fixer" : {
    "description" : "testing pipeline",
    "processors" : [
      {
        "pipeline" : {
          "name" : "filebeat-7.4.2-osquery-result-pipeline"
        }
      },
      {
        "set" : {
          "field" : "custompipe",
          "value" : "testosqueryfixer",
          "ignore_failure" : false
        }
      }
    ]
  }
}

I added the pipeline line to my-filebeat-config.yml, didn't even try to do conditional-only-for-osquery-docs yet:

output.elasticsearch:
  hosts: ["asdf.local:9200"]
  pipeline: "test_osquery_fixer"

Even have the pipeline name in the modules.d/my-osquery.yml config...

# Module: osquery
- module: osquery
  result:
    enabled: true
    var.use_namespace: true
    input:
      pipeline: "test_osquery_fixer"

Final results are:

Under _simulate the pipeline works
For messages being received by filebeat the pipeline doesn't work

DPattee · November 30, 2019, 6:58am

Oh and with logging set to debug, I do see that the pipeline name that I specify isn't being used:

2019-11-29T22:53:06.582-0800	DEBUG	[processors]	processing/processors.go:183	Publish event: {
  "@timestamp": "2019-11-30T06:53:06.582Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.4.2",
    "pipeline": "filebeat-7.4.2-osquery-result-pipeline"
  },
  "ecs": {
    "version": "1.1.0"
  },
  "host": {

And after letting it run for a few minutes, the main logfile (/logs/filebeat) doesn't have the name of my custom pipeline anywhere in it, or any ERROR lines.

DPattee · December 4, 2019, 5:17am

In the meantime, I've just manually added a

  {
    "pipeline" : {
      "name" : "osquery-custom-branch-pipe"
    }
  }

at the end of the processor list in the auto-loaded/default osquery module pipeline. I'll have to remember to do that any time I update filebeat of course.

I think that adding a result.input.pipeline entry to the module's yml should do the same thing that I did manually: run the custom pipeline automatically at the end of the default osquery pipeline.

Kaiyan_Sheng · December 5, 2019, 10:50pm

Which fields, in which module, are you trying to convert to numbers?
Also, for the pipeline, I believe you can delete the old pipeline and then restart filebeat in order to load the new pipeline. For example:

GET _ingest/pipeline
DELETE _ingest/pipeline/filebeat-8.0.0-aws-s3_server_access_log-pipeline

DPattee · December 8, 2019, 6:16pm

Which fields, in which module, are you trying to convert to numbers?

Every result from the osquery module is processed as a string, so I'm converting all sorts of things, from fan speed RPMs to CPU core counts to Celsius temperature sensor readings, back into integers or doubles via a custom pipeline.
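
A minimal sketch of what such a conversion pipeline might look like, using the `convert` ingest processor. The pipeline name matches the one referenced earlier in the thread, but the three field names and the `osquery.result.columns` prefix are assumptions for illustration only; the real names depend on which osquery queries are scheduled and on the `var.use_namespace` setting.

```json
# Sketch only: the three field names are assumed examples, not taken from a real mapping.
PUT _ingest/pipeline/osquery-custom-branch-pipe
{
  "description": "Sketch: convert assumed osquery string columns to numbers",
  "processors": [
    {
      "convert": {
        "field": "osquery.result.columns.cpu_logical_cores",
        "type": "integer",
        "ignore_missing": true
      }
    },
    {
      "convert": {
        "field": "osquery.result.columns.fan_speed_rpm",
        "type": "integer",
        "ignore_missing": true
      }
    },
    {
      "convert": {
        "field": "osquery.result.columns.celsius",
        "type": "double",
        "ignore_missing": true
      }
    }
  ]
}
```

`ignore_missing` is set because each osquery query returns a different set of columns, so most documents will contain only some of these fields. A value that cannot be parsed as a number would still make the processor fail. An index that already mapped these fields as strings keeps that mapping, so the numeric types would probably only take effect in a new index.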

system · January 5, 2020, 6:16pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
