Grok performance problem when parsing dots, dashes, underscores


#1

Hi all,
I'm running into a performance problem with the grok filter.
I use Filebeat 5.4.6 to send log file events to Logstash 5.4.6.
I made a very simple grok filter in Logstash to extract the path and the filename of the log file from the "source" field from Filebeat:
grok {
  match => { "source" => "%{UNIXPATH:[filepath]}/%{NOTSPACE:[filename]}" }
}

It works very well with a lot of filenames, but the filter becomes very slow when there are many dots, dashes, or underscores in the filename.
Example: /var/log/nginx/mynginx01access.log -> very fast
/var/log/nginx/my_nginx-01.access.log -> very slow and CPU costly

I tried many patterns, replacing %{NOTSPACE} with %{DATA}, %{GREEDYDATA}, etc., without any result. The CPU load of the filter seems to grow exponentially with the number of (., -, _) characters in the filename.
If you replace (., -, _) with other special characters (#, $, ^, space, ...), it's fast again.
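The slowdown can be reproduced outside Logstash. Here is a small Python sketch; the UNIXPATH expansion below is quoted from the stock grok-patterns file from memory, so verify it against your install:

```python
import re
import time

# UNIXPATH-like pattern with nested quantifiers (assumed stock definition:
# (/([\w_%!$@:.,+~-]+|\\.)*)+ ). A run of word characters can be split
# between the inner "+" and the outer "*" in exponentially many ways, so a
# FAILED match backtracks through all of them.
unixpath_like = re.compile(r'^(/([\w_%!$@:.,+~-]+|\\.)*)+$')

# A successful match is found quickly:
assert unixpath_like.match('/var/log/nginx/mynginx01access.log')

# A string that cannot match (trailing space) forces full backtracking;
# every extra dot/dash/underscore roughly doubles the work.
t0 = time.perf_counter()
assert unixpath_like.match('/var/my_nginx-01.acc ') is None
print('failed match took %.4fs' % (time.perf_counter() - t0))
```

Lengthening the dotted/dashed part of the failing input by a few characters makes the failed match dramatically slower, which matches the exponential behavior described above.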

I don't know how to fix this problem, because I have tried every pattern I could think of.

Help would be very appreciated.

Simon


(Guy Boertje) #2

Read this https://www.elastic.co/blog/do-you-grok-grok

then after that try:
^%{UNIXPATH:[filepath]}/%{JAVAFILE:[filename]}$
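For reference, the stock definitions behind those two names (quoting the grok-patterns file from memory, so double-check the copy shipped with your install) are roughly:

```
UNIXPATH (/([\w_%!$@:.,+~-]+|\\.)*)+
JAVAFILE (?:[A-Za-z0-9_. -]+)
```

Note that JAVAFILE has a single flat repetition, so it cannot backtrack the way UNIXPATH's nested `(...)*)+` quantifiers can.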


#3

Thank you for your response.
First, I made a mistake on the versions of Filebeat and Logstash: I actually work on the latest 5.6.2, on CentOS 7u3.
I tried changing the match expression as you suggested, but it didn't solve the problem; I reproduced the same bad execution time.


#4

I did a few more tests and solved the problem by replacing the UNIXPATH pattern with DATA or GREEDYDATA.
grok {
  match => { "source" => "^%{DATA:[fields][filepath]}/%{JAVAFILE:[fields][filename]}$" }
}

I don't understand how matching the UNIXPATH pattern can be dependent on the format of the last part of the string, after the "/".

Thank you for your first response.


(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.