Grok performance

shwesinhan · December 20, 2017, 7:58am

Hi there!

I use multiple grok match patterns in my Logstash filter.

For example:

filter {
  grok {
    # grok1
    break_on_match => true
    match => { "message" => [ "^(?<log_timestamp>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{1,3}).?(?\d+)\s{0,}%{WORD:log_level}.?\s:\sV21.OneTwoThreeHelper\s:\sPageRequest\s{0,}::\s{0,}(?<session_id>\w+)\s{0,}.?.?(?[0-9a-zA-Z]{0,})(?.?)"]}
    add_field => { "project_type" => "PROJ1" }
    add_field => { "transaction_type" => "REQUEST" }
  }
  grok {
    # grok2
    break_on_match => true
    match => { "message" => [ "(?<log_timestamp>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{1,3}).(?\d+).?%{WORD:log_level}.?\s:\sPageResponse\s{0,}::\s{0,}(?<session_id>\w+)\s{0,}::\s{0,}.?.?(?[0-9a-zA-Z]{0,})(?.?)"]}
    add_field => { "project_type" => "PROJ2" }
    add_field => { "transaction_type" => "RESPONSE" }
  }
  grok {
    # grok3
    break_on_match => true
    match => { "message" => [ "(?<log_timestamp>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{1,3}).(?\d+).?%{WORD:log_level}.?\s:\sPage3\s{0,}::\s{0,}(?<session_id>\w+)\s{0,}::\s{0,}.?.?(?[0-9a-zA-Z]{0,})(?.*?)"]}
    add_field => { "project_type" => "PROJ3" }
    add_field => { "transaction_type" => "RESPONSE" }
  }
  grok {....}
  grok {....}
  grok {grok40}
}

With break_on_match => true, if a log event matches grok2, does processing exit only grok2, or the whole filter?
I use 40 grok blocks, and I'm not sure about the performance impact of running that many. How can I improve performance, and how should I plan for and manage Logstash performance?

If someone could give advice on this, I'd appreciate it.

Christian_Dahlqvist · December 20, 2017, 4:47pm

Grok allows you to define multiple patterns in the same block, and this is where the 'break_on_match' parameter is useful: it stops processing once a match is found. If you have multiple grok blocks, they will all be evaluated, as the parameter does not span across blocks.
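
As a rough illustration of the single-block layout described above, here is a minimal sketch. The patterns are simplified stand-ins for the ones in the question (they assume the timestamp can be matched with the stock TIMESTAMP_ISO8601 pattern and that each line contains one of the three page markers), so treat them as assumptions rather than drop-in replacements.

```
filter {
  grok {
    # One block, several patterns. With break_on_match => true (the default),
    # grok tries the patterns in order and stops at the first one that matches.
    break_on_match => true
    match => {
      "message" => [
        # Simplified, hypothetical patterns; adjust them to the real log layout.
        "^%{TIMESTAMP_ISO8601:log_timestamp}.*?\s:\sPageRequest\s*::\s*%{WORD:session_id}",
        "^%{TIMESTAMP_ISO8601:log_timestamp}.*?\s:\sPageResponse\s*::\s*%{WORD:session_id}",
        "^%{TIMESTAMP_ISO8601:log_timestamp}.*?\s:\sPage3\s*::\s*%{WORD:session_id}"
      ]
    }
  }
}
```

With this layout an event that matches the first pattern is never tested against the second or third, whereas separate grok blocks would each run their own match against every event.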

shwesinhan · December 21, 2017, 2:24am

hi @Christian_Dahlqvist

thank you

I thought that when a grok pattern matched, processing would exit the grok block and skip the following add_field lines.
That's why I wrote multiple blocks.

I have now changed to a single grok block with break_on_match turned on.
Then I found this:

project_type: PROJ1,PROJ2,PROJ3,.....PROJ40
transaction: REQUEST,RESPONSE,TEST,etc.

break_on_match only skips the remaining patterns; every add_field is still applied.
What should I change to fix that? Please kindly check my config again.

neal1991 · December 21, 2017, 7:33am

Christian_Dahlqvist · December 21, 2017, 4:04pm

It will add the fields if there is a successful match for any one of the patterns in the block. Can you perhaps define the project at the source, e.g. in Filebeat, or use conditionals on the fields after the data has been extracted?
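
One way the conditional approach could look is sketched below. It assumes the marker text (PageRequest, PageResponse, Page3) can be captured into a field, here given the hypothetical name page_type, and that the simplified pattern fits the real log layout. The project and transaction values are the ones from the original config.

```
filter {
  grok {
    # Capture the page marker into a field (page_type is a hypothetical name)
    # so that later conditionals can tell which kind of line matched.
    match => {
      "message" => "^%{TIMESTAMP_ISO8601:log_timestamp}.*?\s:\s(?<page_type>PageRequest|PageResponse|Page3)\s*::\s*%{WORD:session_id}"
    }
  }

  # Add the project and transaction fields based on what was extracted,
  # instead of using add_field inside the grok block.
  if [page_type] == "PageRequest" {
    mutate {
      add_field => {
        "project_type" => "PROJ1"
        "transaction_type" => "REQUEST"
      }
    }
  } else if [page_type] == "PageResponse" {
    mutate {
      add_field => {
        "project_type" => "PROJ2"
        "transaction_type" => "RESPONSE"
      }
    }
  } else if [page_type] == "Page3" {
    mutate {
      add_field => {
        "project_type" => "PROJ3"
        "transaction_type" => "RESPONSE"
      }
    }
  }
}
```

Because only one branch of the conditional runs per event, each event gets a single project_type and transaction_type rather than the combined list of all values.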

