Two logstash instances with same file input

Hi,
I would like to run two different Logstash processes with two different pipelines, using the same input file but different sincedb paths.

So in general, two different processes running different pipelines, but in each pipeline's input definition I use the same file.
Can this be a problem?

Pipeline1:

input {
  file {
    codec => "json"
    path => "/tmp/mydata.log"
    sincedb_path => "/tmp/mydata_sincedb"
  }
}

Pipeline2:

input {
  file {
    codec => "json"
    path => "/tmp/mydata.log"
    sincedb_path => "/tmp/mydata2_sincedb"
  }
}

I didn't try it, but it should work. I don't see any issue, although I also don't quite see the logic in reading the same file twice :slight_smile:
It may not be a good idea to send both streams to the same index, though, as you will end up with duplicated data. And if you just want to read a file and write to two indices, you can set two Elasticsearch indices in the output of a single pipeline.
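To illustrate the "two indices from one pipeline" suggestion, a minimal sketch of such an output block could look like the following. The host and index names here are hypothetical placeholders, not taken from the thread:

```
output {
  # Every event is sent to both Elasticsearch outputs below,
  # each writing to its own index.
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "mydata-copy-a"
  }
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "mydata-copy-b"
  }
}
```

With this approach the file is read only once, and no clone filter is needed, since every output block in a Logstash pipeline receives every event by default.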

Thank you!
I also don't see an issue, but I want to be sure.
Currently I am using a single pipeline with the "clone" filter plugin and then routing to different outputs, but for some reason it is killing the CPU on the machine. I was able to pinpoint the clone plugin as the problem.

So my next idea is to avoid the clone plugin and separate this into two pipelines.

How did you arrive at this conclusion? Can you share your config for the clone filter?

The clone plugin is pretty simple; I would not expect it to have any noticeable impact on performance.

Reading the same file with two inputs and two different sincedb paths should work, but an alternative is to read the file once and use pipeline-to-pipeline communication to send the events to two processing pipelines.

In this case you would have three pipelines: one reading the file and two processing the events.
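A minimal sketch of the pipeline-to-pipeline layout described above (pipeline IDs, file paths, and virtual addresses are hypothetical; this assumes a Logstash version that supports the `pipeline` input/output, which, as noted later in the thread, 2.3.3 does not):

```
# pipelines.yml
# - pipeline.id: reader
#   path.config: "reader.conf"
# - pipeline.id: rabbit-out
#   path.config: "rabbit.conf"
# - pipeline.id: kafka-out
#   path.config: "kafka.conf"

# reader.conf -- reads the file once, fans events out to both pipelines
input {
  file {
    codec => "json"
    path => "/tmp/mydata.log"
  }
}
output {
  pipeline { send_to => ["rabbit-events", "kafka-events"] }
}

# rabbit.conf -- receives every event
input {
  pipeline { address => "rabbit-events" }
}
output {
  rabbitmq {
    # ...
  }
}

# kafka.conf -- receives every event, filtering can go here
input {
  pipeline { address => "kafka-events" }
}
output {
  kafka {
    # ...
  }
}
```

With this layout the file and its sincedb are read only once, and each downstream pipeline can apply its own filters and output settings independently.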


Here is my config

input {
  ##file
}

filter {
  clone {
    clones => [ 'MY_clone' ]
  }

  if [type] == 'MY_clone' {
    mutate {
      add_field => { "[@metadata][destination]" => "DEST1" }
    }
  }
  else if [some_field] != -5 and [type] != 'MY_clone' {
    mutate {
      add_field => { "[@metadata][destination]" => "DEST2" }
    }
  }
}

output {
  if [@metadata][destination] == "DEST1" {
    rabbitmq {
      #xxxx
    }
  }
  else if [@metadata][destination] == "DEST2" {
    kafka {
      #xxxx
    }
  }
}

Basically, in the filter phase I am just deciding where to send the data based on a @metadata field.
Then I just route the output based on that @metadata field.

It is probably worth mentioning that the Logstash version is .. ancient (2.3.3). I need to use such an old version because of Kafka compatibility issues with our very old version of Kafka (0.9) :frowning:

And when I comment out the clone plugin, the CPU is not thrashed anymore.
Of course, I understand that this basically cuts the number of events in half.

Oh I see.

Maybe it is an issue in that version; it is pretty old, and there have been many improvements in later versions.

Unfortunately you cannot use my suggestion, as pipeline-to-pipeline communication does not exist in that version.

It's not "old", it's from the time of dinosaurs and reptiles.
Can you change the logic in the output? The clone filter makes two copies of the same message, but your conditionals route each copy to a single destination anyway. If you always need to send to Kafka, then use a conditional only on the rabbitmq output.


Are you doing any other transformations, or is this your entire pipeline?

I'm not sure why you are cloning the events; the only difference between the two copies is that the kafka output has an extra conditional, so why not put that conditional in the output?

output {
  rabbitmq { }
  if [some_field] != -5 {
    kafka { }
  }
}

Or, if you prefer, you can tag the event in the filter block.

filter {
  if [some_field] != -5 {
    mutate {
      add_field => { "[@metadata][destination]" => "kafka" }
    }
  }
}
output {
  rabbitmq { }
  if [@metadata][destination] == "kafka" {
    kafka { }
  }
}

If what you shared is the entire pipeline, I see no reason to use the clone filter.


DEST1 needs to receive all the events from the input.
DEST2 must receive all the events except those with some_field == -5.

Maybe this can be simplified without cloning...

From what you shared, you do not need cloning, just a single conditional to decide whether an event should go to an output or not.

The examples I shared above may give you some insight.

:slight_smile:
I did not say "old":

the logstash version is .. ancient (2.3.3).

Add in the code:
mutate { add_field => { "[@metadata][protection]" => "Shoo shoo dinosaurs and reptiles" } }

:slight_smile:
