Grok parsing: high CPU and Logstash not stopping

SLES 12 SP3
ELK 7.1.0
Hello everybody,

I have a problem with my Logstash: whenever I add a grok pattern for a certain part of my log, the CPU usage keeps going up and Logstash will not stop. It looks like the threads running the grok filter are frozen:

[2019-05-24T15:25:33,507][WARN ][org.logstash.execution.ShutdownWatcherExt] {"inflight_count"=>0, "stalling_threads_info"=>{["LogStash::Filters::Grok", {"match"=>{"message"=>"%{YEAR}-%{MONTHNUM}-%{MONTHDAY}[T ]%{HOUR}:?%{MINUTE}(?::?%{SECOND})?%{ISO8601_TIMEZONE}? %{IPV4:IP} %{UNIXPATH:Chemin} %{TIMESTAMP_ISO8601:logTime} %{LOGLEVEL:Loglevel} %{WORD:Domain}\\%{WORD:user} %{GREEDYDATA:AppLog}"}, "id"=>"3f4d487ba4a68ee9d37bc51493bb156a295dee3894b8dcf78137ecfb25ab1f9d"}]=>[{"thread_id"=>25, "name"=>"[main]>worker0", "current_call"=>"[...]/vendor/bundle/jruby/2.5.0/gems/jls-grok-0.11.5/lib/grok-pure.rb:182:in `match'"}, {"thread_id"=>26, "name"=>"[main]>worker1", "current_call"=>"[...]/vendor/bundle/jruby/2.5.0/gems/jls-grok-0.11.5/lib/grok-pure.rb:182:in `match'"}, {"thread_id"=>28, "name"=>"[main]>worker3", "current_call"=>"[...]/vendor/bundle/jruby/2.5.0/gems/jls-grok-0.11.5/lib/grok-pure.rb:182:in `match'"}, {"thread_id"=>29, "name"=>"[main]>worker4", "current_call"=>"[...]/vendor/bundle/jruby/2.5.0/gems/jls-grok-0.11.5/lib/grok-pure.rb:182:in `match'"}, {"thread_id"=>30, "name"=>"[main]>worker5", "current_call"=>"[...]/vendor/bundle/jruby/2.5.0/gems/jls-grok-0.11.5/lib/grok-pure.rb:182:in `match'"}]}}

The pattern I am trying to match is:

%{YEAR}-%{MONTHNUM}-%{MONTHDAY}[T ]%{HOUR}:?%{MINUTE}(?::?%{SECOND})?%{ISO8601_TIMEZONE}? %{IPV4:IP} %{UNIXPATH:Chemin} %{TIMESTAMP_ISO8601:logTime} %{LOGLEVEL:Loglevel} %{WORD:Domain}\%{WORD:user} %{GREEDYDATA:AppLog}

An example log line would be:

2019-05-24T15:05:14.957392+02:00 10.254.177.1 /LM/W3SVC/1/ROOT/WS_PCIDSS2-1-132031733574297090: 2019-05-24 15:05:14,040 DEBUG Domain\User VMTESTTOTO DCC/Action : CNXname = VMFDTOtoto

I tried to run Logstash in debug mode, but nothing out of the ordinary came up:

https://gist.githubusercontent.com/Antoine-SSAG/5685613990ffd0bfe13f991726c666b2/raw/f84b097cf740fdb6395d4aa6a97ef70212d9d436/debug.txt
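For reference, this is roughly how I launched it for the debug run (the pipeline file path here is just an example, adapt it to your setup):

bin/logstash --log.level debug -f /etc/logstash/conf.d/apppci.conf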

The parsing fails when I try to capture the Windows user that appears in the log after the level (Domain\User).

My configuration is:

input {
  beats {
    port => 5018
    type => "AppPCI"
    tags => ["P101"]
  }
}
filter {
  grok {
    match => { "message" => "%{YEAR}-%{MONTHNUM}-%{MONTHDAY}[T ]%{HOUR}:?%{MINUTE}(?::?%{SECOND})?%{ISO8601_TIMEZONE}? %{IPV4:IP} %{UNIXPATH:Chemin} %{TIMESTAMP_ISO8601:logTime} %{LOGLEVEL:Loglevel} %{WORD:Domain}\%{WORD:user} %{GREEDYDATA:AppLog}" }
  }
}
output {
  if [type] == "AppPCI" {
    elasticsearch {
      index => "%{[@metadata][beat]}-%{[@metadata][version]}-apppci-%{+YYYY.MM.dd}"
      hosts => ["http://10.254.160.15:9200"]
    }
  }
}
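If somebody wants to reproduce this quickly without Filebeat, a stripped-down pipeline along these lines should be enough to test the pattern against the sample line above (just a sketch for testing, not my real config):

input {
  # paste the sample log line on stdin
  stdin {}
}
filter {
  grok {
    # same pattern as in my real pipeline
    match => { "message" => "%{YEAR}-%{MONTHNUM}-%{MONTHDAY}[T ]%{HOUR}:?%{MINUTE}(?::?%{SECOND})?%{ISO8601_TIMEZONE}? %{IPV4:IP} %{UNIXPATH:Chemin} %{TIMESTAMP_ISO8601:logTime} %{LOGLEVEL:Loglevel} %{WORD:Domain}\%{WORD:user} %{GREEDYDATA:AppLog}" }
  }
}
output {
  stdout { codec => rubydebug }
}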

I'm really getting desperate, so if somebody has an idea it would be really appreciated.

Thanks in advance,
Antoine

I think you don't need the part before %{IPV4:IP}.
Also, in %{WORD:Domain}\%{WORD:user} the single backslash ends up escaping the "%", so you need to escape it with a second backslash: %{WORD:Domain}\\%{WORD:user}
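For example, with both changes the grok block would look roughly like this (keeping the rest of your pattern as it is):

grok {
  match => { "message" => "%{IPV4:IP} %{UNIXPATH:Chemin} %{TIMESTAMP_ISO8601:logTime} %{LOGLEVEL:Loglevel} %{WORD:Domain}\\%{WORD:user} %{GREEDYDATA:AppLog}" }
}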

I hope this will help.


Do not use UNIXPATH. It is extremely CPU intensive.
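If the path never contains spaces, something much cheaper such as NOTSPACE will usually do. For example (a sketch using the field names from your config, with the escaped backslash from the earlier reply):

%{IPV4:IP} %{NOTSPACE:Chemin} %{TIMESTAMP_ISO8601:logTime} %{LOGLEVEL:Loglevel} %{WORD:Domain}\\%{WORD:user} %{GREEDYDATA:AppLog}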


Hello Logger,

Thanks for taking the time to help. I removed the part before %{IPV4:IP} and it works perfectly. :smile:
I also added the extra \; I'm not sure why it showed up as a single backslash in my post.

Thanks for the tips ^^

Hello Badger,

You are my saviour! I don't know why, but the UNIXPATH pattern was what caused the grok to hang; when I use a custom one, everything works perfectly fine.

In case it happens to other people, I used this custom pattern:
URL (?=/)((.*?))(?=)

In the end, with the advice of @logger, my filter looks like this:

%{IPV4:IP} %{URL:Chemin} %{TIMESTAMP_ISO8601:logTime} %{LOGLEVEL:Loglevel} %{WORD:Domain}\%{WORD:user} %{MESSAGE:AppLog}
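And in case it helps, the custom patterns can also be declared inline with pattern_definitions instead of a separate patterns file. Here is a sketch of that; the MESSAGE definition below is only an assumption (a greedy catch-all), adjust it to whatever your real custom pattern is:

filter {
  grok {
    # URL and MESSAGE are custom patterns, declared inline instead of in a patterns directory
    pattern_definitions => {
      "URL" => "(?=/)((.*?))(?=)"
      # assumption: MESSAGE is just a greedy catch-all, like GREEDYDATA
      "MESSAGE" => ".*"
    }
    match => { "message" => "%{IPV4:IP} %{URL:Chemin} %{TIMESTAMP_ISO8601:logTime} %{LOGLEVEL:Loglevel} %{WORD:Domain}\\%{WORD:user} %{MESSAGE:AppLog}" }
  }
}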

