Abnormally high CPU utilization by Logstash

Dear all,

I have just one grok filter in my Logstash pipeline, and it drives CPU utilization up to 1600%.

Could you please help me find the bottleneck in my filter and fix it?

Thank you in advance.

Here is my filter:

filter {

if [fields][type] == "test_st" {
grok {
match => { "message" => ["(?<action_date>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}.(?\d{0,3})) [(?<session_id>%{WORD:username}
?[^]])] .HOST[ = ]%{HOSTNAME:db_name}((.|\n))).[\s\n](?<sql_command>[(\w+] ((.|\n)))[\s\n]{(?<d_time>[^}])}",
"(?<action_date>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}.(?\d{0,3})) [(?<session_id>%{WORD:username}?[^]]
)] .@(?<db_name>.)[\s\n](?<sql_command>[(\w+] ((.|\n)))[\s\n]{(?<d_time>[^}]*)}"]}
}
date {
match => ["action_date","yyyy-MM-dd HH:mm:ss.SSS"]
target => "@timestamp"
}
}

else if [fields][type] == "test_log" {
grok {
match => { "message" => ["%{DATESTAMP:action_date}\s+[%{LOGLEVEL:log_level}]\s+{(?<app_log>[^}])}\s+[(?<session_id>%{WORD:username}?[^]])]\s+-\s+(?<command_response>(((.|\n))?(?=^\d[^\d])((.|\n)))|((.|\n))*)"]}
}

date {
match => ["action_date","dd-MM-yyyy HH:mm:ss"]
target => "@timestamp"
}
}

else if [fields][type] == "test_d" {
grok {
match => { "message" => ["%{DATESTAMP:action_date} [%{LOGLEVEL:log_level}] {(?<app_log>[^}])} [(?<session_id>%{WORD:username}?[^]])] -.: (?.),User:.,Start date2:(?<start_date>.),End date1:(?<end_date>.*),"]}
}
date {
match => ["action_date","dd-MM-yyyy HH:mm:ss"]
target => "@timestamp"
}
}

else if [fields][type] == "test_error" {
grok {
match => { "message" => ["%{DATESTAMP:action_date} [%{LOGLEVEL:log_level}] {(?<app_log>[^}])} [(?<session_id>%{WORD:username}?[^]])] - [.] (?((.|\n)))"]}
}
date {
match => ["action_date","dd-MM-yyyy HH:mm:ss"]
target => "@timestamp"
}
}

else if [fields][type] == "robin" {
 grok {
 match => { "message" => ["(?<action_date>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME},(?<milliseconds>\d{0,3}))\s*%{LOGLEVEL:log_level}\s\[(?<session_id>(?<sms_id>\d+)?-(?<msisdn>\d+)?-[^\]]*)\]\s*\((?<app_log>[^\)]*)\)\s*(?<command_response>((.|\n))*)"]}
 }
 date {
   match => ["action_date","yyyy-MM-dd HH:mm:ss,SSS"]
   target => "@timestamp"
  }
 }

}

Your grok expressions are very hard to read because you have not formatted the config as code. I suspect you have inefficient grok expressions, and that might be what causes the high CPU usage, but without proper formatting it is hard to tell: for example, * turns the text italic instead of being displayed.

Could you please tell me what is wrong with this expression?

["^%{DATESTAMP:action_date}\s+[%{LOGLEVEL:log_level}]\s+{(?<app_log>[^}])}\s+[(?<session_id>%{WORD:username}?[^]])]\s+-\s+(?<command_response>((.|\n))*)$"]

You need to format your post so that characters are not consumed as markdown. Notice how your expression turns italic halfway through. That is because characters from your regexp are being interpreted as markdown.
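For illustration only, here is a rough sketch of what a less expensive version of a pattern like this might look like once it is posted as preformatted text. The line shape (date, [level], {app}, [session], " - ", message) is a guess reconstructed from the garbled expression, so the escaped brackets and the `*` quantifiers are assumptions, and the optional username capture is left out for brevity. The ideas it shows are general: anchor the pattern at the start so non-matching lines fail quickly, use the `(?m)` flag with `%{GREEDYDATA}` rather than `((.|\n))*` for a multiline tail, and set `timeout_millis` so a runaway match is aborted and tagged.

```
filter {
  if [fields][type] == "test_log" {
    grok {
      # Assumed line shape (not confirmed by the original post):
      #   <date> [<LEVEL>] {<app_log>} [<session_id>] - <message, possibly multiline>
      match => {
        # \A anchors at the start of the event; (?m) lets "." match newlines,
        # so GREEDYDATA can take the rest of a multiline message in one pass.
        "message" => "(?m)\A%{DATESTAMP:action_date}\s+\[%{LOGLEVEL:log_level}\]\s+\{(?<app_log>[^}]*)\}\s+\[(?<session_id>[^\]]*)\]\s+-\s+%{GREEDYDATA:command_response}"
      }
      # Abort any single match attempt that runs longer than 5 seconds
      # (the value is arbitrary) and tag the event instead.
      timeout_millis => 5000
      tag_on_timeout => "_groktimeout"
    }
  }
}
```

Events tagged `_groktimeout` or `_grokparsefailure` then show which messages the pattern struggles with, which is usually a better starting point than guessing from CPU usage alone.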
