Logstash best practice for filter order

Hi, I have a lot of inputs and filters in my Logstash configuration, and I've been wondering whether they are optimized and how I can measure Logstash performance. So far I only monitor "heap used", and it looks fine. This led me to a question: there are two ways to lay out a Logstash config file. Is there any difference between them?
I mean in terms of event processing speed and log parsing.
This is Type 1:

filter 1{
              grok1 { .......... }
              geoip { .......... }
              ip2location { .......... }
              ruby { .......... }
} 
filter 2{
              grok2 { .......... }
              geoip { .......... }
              ip2location { .......... }
              ruby { .......... }
}
filter 3{
              grok3 { .......... }
              geoip { .......... }
              ip2location { .......... }
              ruby { .......... }
}

and this is Type 2:

filter1 {
          grok1 { .......... }
}
filter2 {
          grok2 { .......... }
}
filter3 {
          grok3 { .......... }
}

filter {
          geoip { .......... }
          ip2location { .......... }
          ruby { .......... }
}

My question is: does the number of duplicated filters affect performance?
Can Type 2 be more efficient than Type 1, or is my guess wrong?
I'd also like to know how I can measure the performance of Logstash config files.
I know it's a very general question, but I just need a few basic ideas and best practices to follow.
Thank you so much.

The thing is that Logstash will merge all of these together at run time. Are they all processing the same logs, or different types? If they are different, can you use separate pipelines?

As @warkolm said, Logstash will merge the blocks at run time.

So the following example

filter { 
    grok1
    geoip
    etc
}
filter { 
    grok2
    geoip
    etc
}

will become this when Logstash starts:

filter { 
    grok1
    geoip
    etc
    grok2
    geoip
    etc
}

Logstash will merge the blocks. This applies to input, filter, and output blocks, but it will keep the order of the inputs, filters, and outputs.

As for the performance question, it depends on whether you are using conditionals in your filter blocks. If you are not using any conditionals, every event will pass through every filter, so having fewer filters will be better for performance.
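
As a rough sketch of what conditionals could look like here, assuming each input sets a type field (the values "apache" and "syslog" and the grok patterns below are placeholders, not taken from your config), each event would only go through the grok that matches its type:

```
filter {
  # Hypothetical routing: [type] is assumed to be set on the input side.
  if [type] == "apache" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  } else if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGLINE}" }
    }
  }
  # Events with any other type skip both grok filters.
}
```

Without the conditionals, both grok filters would run against every event, and the one that does not match would just add a _grokparsefailure tag.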

In this case, your Type 2 example should perform better simply because you are applying the geoip, ip2location, and ruby filters only once per event; in your Type 1 example you are applying those filters three times per event.
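
A minimal sketch of that layout, with the shared filters defined once after the type-specific parsing. The field name [client][ip], the id values, and the ruby code are assumptions for illustration only, and ip2location is left out because its options depend on the plugin version you have installed:

```
filter {
  # ... type-specific grok filters go first ...

  # Shared enrichment, applied once per event.
  geoip {
    id     => "geoip_shared"        # hypothetical id, used only for monitoring
    source => "[client][ip]"        # assumed field holding the client IP
  }
  ruby {
    id   => "ruby_shared"           # hypothetical id
    code => "event.set('enriched', true)"   # placeholder for your own logic
  }
}
```

Setting an explicit id on each plugin is optional, but it makes the plugin easier to find in the node stats API (for example curl -XGET 'localhost:9600/_node/stats/pipelines?pretty', assuming the default API port). That output reports event counts and duration_in_millis per plugin, which is one way to see where the time is being spent.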
