I have a use case where I would like to assign a different type of log input from filebeat to its own pipeline. The hope is that I can use filebeat to 1) assign the log message to the appropriate pipeline and 2) to insert a field tag to each log record that identifies the log type.
So let's say I have 3 log types: typeA, typeB, typeC. When Filebeat recognizes an update to a typeA log, it reads the log file, assigns each log message to pipelineA, and appends a typeA value to each message before outputting it to pipelineA.
The documentation I find shows that pipelines are defined in Logstash (pipelines.yml). Presumably Filebeat uses that definition from the Logstash configuration to perform the pipeline association in filebeat.yml or elsewhere. I think I have a decent idea of how to create the pipeline definition at the Logstash level, but I don't know how to configure what I describe above at the Filebeat level.
Filebeat does not select pipelines in Logstash. In Logstash you can configure multiple pipelines, each with its own set of inputs, filters, and outputs. Logstash also supports forwarding events from one pipeline to another via special output and input plugins.
Only one pipeline can have the Beats input (binding a port). So you would need 4 pipelines in LS: one for accepting and filtering/forwarding events, plus one pipeline per log type.
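As a sketch of that 4-pipeline layout, using the pipeline-to-pipeline `pipeline` output/input plugins (pipeline IDs, the port, and the `[fields][log_type]` field name are all illustrative assumptions):

```conf
# beats-ingest.conf — the only pipeline binding the Beats port;
# it routes each event to a per-type pipeline based on a field set by Filebeat.
input {
  beats { port => 5044 }
}
output {
  if [fields][log_type] == "typeA" {
    pipeline { send_to => ["pipelineA"] }
  } else if [fields][log_type] == "typeB" {
    pipeline { send_to => ["pipelineB"] }
  } else {
    pipeline { send_to => ["pipelineC"] }
  }
}

# pipelineA.conf — a downstream pipeline receives via a matching address:
# input { pipeline { address => "pipelineA" } }
```

Each `send_to` address must match the `address` of a `pipeline` input in the receiving pipeline, and each pipeline gets its own entry in pipelines.yml.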
Depending on complexity/processing requirements you might not need multiple pipelines, though.
Having 3 log types, I presume you have 3 different prospectors configured. You can add custom tags/fields to each prospector. These tags/fields can be used in Logstash to select one pipeline or the other.
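For example, something like this in filebeat.yml (the paths and the `log_type` field name are illustrative; adjust to your layout):

```yaml
# filebeat.yml — one prospector per log type, each tagging its events
filebeat.prospectors:
  - type: log
    paths: ["/var/log/app/typeA/*.log"]
    fields:
      log_type: typeA
  - type: log
    paths: ["/var/log/app/typeB/*.log"]
    fields:
      log_type: typeB
  - type: log
    paths: ["/var/log/app/typeC/*.log"]
    fields:
      log_type: typeC

output.logstash:
  hosts: ["localhost:5044"]
```

By default these custom fields arrive in Logstash under `[fields]`, e.g. `[fields][log_type]`.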
In logstash you can just filter for if [@metadata][pipeline] == "pipelineA" { ... }. No extra filtering required. The elasticsearch output removes all @metadata fields when indexing. You can use if conditions in logstash filters or the output section for custom filtering or customized routing (to different output or another logstash pipeline).
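To get that `[@metadata][pipeline]` field populated in the first place, one option is a small filter near the input that maps the Filebeat field onto `@metadata` (a sketch, assuming Filebeat sets `[fields][log_type]` as above):

```conf
# Copy the Filebeat-supplied log type into @metadata once;
# all later conditionals can then guard on [@metadata][pipeline].
filter {
  if [fields][log_type] == "typeA" {
    mutate { add_field => { "[@metadata][pipeline]" => "pipelineA" } }
  } else if [fields][log_type] == "typeB" {
    mutate { add_field => { "[@metadata][pipeline]" => "pipelineB" } }
  } else if [fields][log_type] == "typeC" {
    mutate { add_field => { "[@metadata][pipeline]" => "pipelineC" } }
  }
}
```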
In Logstash you can have multiple configuration files per pipeline. Having one conf file per log type, with its filter wrapped in a conditional as a guard, should be enough. Then you don't need actual Logstash pipelines (well, it depends).
e.g.
pipelineA.conf:
filter {
if [@metadata][pipeline] == "pipelineA" {
...
}
}
pipelineB.conf:
filter {
if [@metadata][pipeline] == "pipelineB" {
...
}
}
You can add outputs to the same file or some other file (using the same trick). Note: this is not using Logstash pipelines. Logstash pipelines are about redistributing/isolating work, which is needed only for complex scenarios, and they complicate setup and tuning.
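An output file using the same guard might look like this (hosts and index names are illustrative assumptions):

```conf
# outputs.conf — route on the same @metadata field; @metadata is
# dropped by the elasticsearch output before indexing.
output {
  if [@metadata][pipeline] == "pipelineA" {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "typea-%{+YYYY.MM.dd}"
    }
  } else {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "logs-%{+YYYY.MM.dd}"
    }
  }
}
```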