How to optimize multiple configuration files to improve performance with many filters


(Hyzhang6639) #1

We have several configuration files containing about 50 filters. It takes about 9 minutes to filter 10,000,000 events.
After reducing to 30 filters, it takes around 6 minutes to filter 10,000,000 events.

In the future, we need to add more filters and are concerned about performance if too many are used. We considered developing a new filter plugin to improve performance, but that appears to be time-consuming. We're wondering whether other solutions can improve performance when many filters (100+) are used.


(Christian Dahlqvist) #2

Are you able to share your config?


(Hyzhang6639) #3

Thanks for your reply. Below is an example of the config. The configuration files include some "if ... else if" statements.

if [Type1] == "..." {
    if [message] {
        csv {
            columns => [ "..." ]
            autogenerate_column_names => false
        }

        if [Type2] == "..." {
            csv {
                columns => [
                    ...
                ]
            }
        } else if [Type2] == "..." {
            csv {
                columns => [
                    ...
                ]
            }
        }
        ....
    }
}

(Christian Dahlqvist) #4

Is it mostly conditionals around different csv formats? Is there any other type of processing that could be slow? What inputs do you have? How many columns do the csv files typically have?


(Hyzhang6639) #5

Yes. The conditionals are all around csv formats. The filters are mostly csv, and the options used are mostly "columns" and "convert".
The event sources are log files from the OS and some applications.
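
For reference, a csv filter using those two options would look roughly like the sketch below. The column names and the converted field are hypothetical placeholders, not taken from our real config, which has 50+ columns per format.

```conf
filter {
  csv {
    # Field that holds the raw CSV line (this is also the plugin default).
    source => "message"
    separator => ","
    # Hypothetical column names, for illustration only.
    columns => [ "timestamp", "host", "level", "bytes" ]
    # Convert selected columns from string to a typed value.
    convert => { "bytes" => "integer" }
    # Do not invent names such as column5 for extra fields.
    autogenerate_column_names => false
  }
}
```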


(Hyzhang6639) #6

In addition, there are typically 50+ columns in the csv event source.


(Christian Dahlqvist) #7

You may want to look into using the dissect filter instead of the csv filter, as it can be considerably faster.
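
A minimal sketch of what such a swap might look like, assuming a plain comma-separated line with four hypothetical fields (the field names are placeholders, not from the config above). Note that dissect splits purely on the literal delimiters in the mapping, so it would not cope with quoted values that contain commas the way the csv filter does.

```conf
filter {
  dissect {
    # Split the message on literal commas into named fields.
    # Field names here are hypothetical.
    mapping => {
      "message" => "%{timestamp},%{host},%{level},%{bytes}"
    }
    # Roughly equivalent to the csv filter's "convert" option;
    # dissect supports int and float conversions.
    convert_datatype => {
      "bytes" => "int"
    }
  }
}
```

Each csv block inside the existing conditionals could be replaced this way one format at a time, which makes it easier to compare throughput before and after.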


(Hyzhang6639) #8

Hi Christian,
Thanks for your reply. I will test whether the dissect filter improves performance in our scenario. I also noticed something else: if the lowercase option in the "mutate" filter is commented out as shown below, the transfer rate increases from 24KiB/s to 30KiB/s. Is this working as designed? Is there an alternative to "lowercase" in the "mutate" filter that would perform better?

#lowercase => [ "host", "Column1" ]


(Hyzhang6639) #9

In addition, I tried a similar scenario on another box, and there was no obvious difference between enabling and disabling the lowercase option, as shown below. I am wondering which parameter or setting could cause the difference for the lowercase option.

9.54MiB 0:05:44 [28.4kiB/s] (Enable lowercase)
9.54MiB 0:05:35 [29.1kiB/s] (Disable lowercase)

