Logstash stalls with grok pattern with optional match

Hi there,

I get the following error when trying to match an optional part in the logs. It seems related to Ruby; here is the full log I get when I hit Ctrl+C after Logstash stalls and starts consuming every bit of CPU on my 4-CPU machine:

{"inflight_count"=>375, "stalling_thread_info"=>{["LogStash::Filters::Grok", {"match"=>{"message"=>" \\- \\- \\[%{HTTPDATE:timestamp}\\] Host:\"%{HOSTNAME}\" RequestedHost:\"%{HOSTNAME}\" \"Requested:\" \"%{NOTSPACE} %{UNIXPATH:full_path}(%{URIPARAM:parameters})? %{NOTSPACE} %{INT:result_code} %{INT:bytes}"}}]=>[{"thread_id"=>20, "name"=>"[main]>worker1", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}, {"thread_id"=>21, "name"=>"[main]>worker2", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}, {"thread_id"=>22, "name"=>"[main]>worker3", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}]}} {:level=>:warn}
{"inflight_count"=>375, "stalling_thread_info"=>{["LogStash::Filters::Grok", {"match"=>{"message"=>" \\- \\- \\[%{HTTPDATE:timestamp}\\] Host:\"%{HOSTNAME}\" RequestedHost:\"%{HOSTNAME}\" \"Requested:\" \"%{NOTSPACE} %{UNIXPATH:full_path}(%{URIPARAM:parameters})? %{NOTSPACE} %{INT:result_code} %{INT:bytes}"}}]=>[{"thread_id"=>20, "name"=>"[main]>worker1", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}, {"thread_id"=>21, "name"=>"[main]>worker2", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}, {"thread_id"=>22, "name"=>"[main]>worker3", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}]}} {:level=>:warn}
{"inflight_count"=>375, "stalling_thread_info"=>{["LogStash::Filters::Grok", {"match"=>{"message"=>" \\- \\- \\[%{HTTPDATE:timestamp}\\] Host:\"%{HOSTNAME}\" RequestedHost:\"%{HOSTNAME}\" \"Requested:\" \"%{NOTSPACE} %{UNIXPATH:full_path}(%{URIPARAM:parameters})? %{NOTSPACE} %{INT:result_code} %{INT:bytes}"}}]=>[{"thread_id"=>20, "name"=>"[main]>worker1", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}, {"thread_id"=>21, "name"=>"[main]>worker2", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}, {"thread_id"=>22, "name"=>"[main]>worker3", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}]}} {:level=>:warn}
{"inflight_count"=>375, "stalling_thread_info"=>{["LogStash::Filters::Grok", {"match"=>{"message"=>" \\- \\- \\[%{HTTPDATE:timestamp}\\] Host:\"%{HOSTNAME}\" RequestedHost:\"%{HOSTNAME}\" \"Requested:\" \"%{NOTSPACE} %{UNIXPATH:full_path}(%{URIPARAM:parameters})? %{NOTSPACE} %{INT:result_code} %{INT:bytes}"}}]=>[{"thread_id"=>20, "name"=>"[main]>worker1", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}, {"thread_id"=>21, "name"=>"[main]>worker2", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}, {"thread_id"=>22, "name"=>"[main]>worker3", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}]}} {:level=>:warn}
{"inflight_count"=>375, "stalling_thread_info"=>{["LogStash::Filters::Grok", {"match"=>{"message"=>" \\- \\- \\[%{HTTPDATE:timestamp}\\] Host:\"%{HOSTNAME}\" RequestedHost:\"%{HOSTNAME}\" \"Requested:\" \"%{NOTSPACE} %{UNIXPATH:full_path}(%{URIPARAM:parameters})? %{NOTSPACE} %{INT:result_code} %{INT:bytes}"}}]=>[{"thread_id"=>20, "name"=>"[main]>worker1", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}, {"thread_id"=>21, "name"=>"[main]>worker2", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}, {"thread_id"=>22, "name"=>"[main]>worker3", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}]}} {:level=>:warn}
{"inflight_count"=>375, "stalling_thread_info"=>{["LogStash::Filters::Grok", {"match"=>{"message"=>" \\- \\- \\[%{HTTPDATE:timestamp}\\] Host:\"%{HOSTNAME}\" RequestedHost:\"%{HOSTNAME}\" \"Requested:\" \"%{NOTSPACE} %{UNIXPATH:full_path}(%{URIPARAM:parameters})? %{NOTSPACE} %{INT:result_code} %{INT:bytes}"}}]=>[{"thread_id"=>20, "name"=>"[main]>worker1", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}, {"thread_id"=>21, "name"=>"[main]>worker2", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}, {"thread_id"=>22, "name"=>"[main]>worker3", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}]}} {:level=>:warn}

Any idea?
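In case it helps anyone narrow this down: the stack traces all point at the grok regex engine, and the fragment around the optional group, %{UNIXPATH:full_path}(%{URIPARAM:parameters})?, is a reasonable first suspect, since %{UNIXPATH} contains nested quantifiers that can backtrack heavily on input it cannot fully match. A minimal, illustrative isolation config (only that suspect fragment of the pattern above, fed from stdin so individual lines can be pasted and tested one at a time) could look like this:

input {
  stdin { }
}

filter {
  grok {
    match => { "message" => "%{UNIXPATH:full_path}(%{URIPARAM:parameters})?" }
  }
}

output {
  stdout {
    codec => rubydebug
  }
}

If one pasted line pins a CPU while the others return immediately, the stall is regex backtracking on that line rather than anything else in the pipeline.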

Hi,

I am seeing this as well. Did you manage to solve it?

Hi,

I am also experiencing this problem with the grok filter. Previously, LS was able to ingest my logs without any problems.
My version is logstash-2.3.2-1.noarch.

Could any developer take a look at it?

Can you share your configuration and sample log lines? This looks like a stalled pipeline is preventing you from moving forward.

This problem occurs when the logfile you want to analyze contains a large number of lines. Debugging with a logfile containing 10 lines works, and the grok filter evaluates the lines perfectly. It is not a problem with the regular expression patterns; the very same patterns worked before.
Increasing the number of lines makes Logstash stall in the grok filter:
{"inflight_count"=>45, "stalling_thread_info"=>{["LogStash::Filters::Grok", {"patterns_dir"=>["/etc/logstash/patterns/"], "match"=>{"message"=>"%{TRAN
SFER_CLASSIC}"}, "periodic_flush"=>"true", "remove_field"=>["message"]}]=>[{"thread_id"=>20, "name"=>"[main]>worker1", "current_call"=>"[...]/vendor/b
undle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}]}} {:level=>:warn}

To isolate the problem, I have run Logstash on the same machine where the logs are located. I have also simplified the configuration file, and the problem still persists:
input {
  file {
    path => "/var/test/*.log"
    start_position => beginning
    sincedb_path => "/dev/null"
    type => "billing"
    ignore_older => 0
  }
}

filter {
  grok {
    patterns_dir => ["/etc/logstash/patterns/"]
    match => { "message" => "%{TRANSFER_CLASSIC}" }
    periodic_flush => true
    remove_field => [ "message" ]
  }

  date {
    match => [ "billing_time", "MM.dd HH:mm:ss" ]
    timezone => "CET"
    remove_field => [ "billing_time" ]
  }
}

output {
  stdout {
    codec => rubydebug
  }
}

I want to point out that the patterns have been tested and worked with previous versions of Logstash. The version is logstash-2.3.2-1.noarch. The operating system is Scientific Linux 7 (based on Red Hat) and we use OpenJDK.

Any help will be appreciated.
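One way to find the offending line without touching the pipeline, sketched below, is to drive the same jls-grok code that appears in the stack trace from a small Ruby script and time each line. The log file path is an assumption based on the config above, and the stock Logstash patterns directory may also need to be loaded if TRANSFER_CLASSIC builds on base patterns such as %{INT}.

# Illustrative sketch: time the same jls-grok match from the stack trace, line by line,
# to find which input line sends the regex engine into runaway backtracking.
# Run with the JRuby bundled with Logstash so the vendored jls-grok gem is on the load path.
require "grok-pure"

grok = Grok.new
# Load the same custom patterns directory as the grok filter above.
Dir.glob("/etc/logstash/patterns/*").each { |f| grok.add_patterns_from_file(f) }
grok.compile("%{TRANSFER_CLASSIC}")

STDOUT.sync = true
# Path is an assumption; point it at one of the /var/test/*.log files.
File.foreach("/var/test/sample.log").with_index(1) do |line, n|
  print "line #{n}: "
  started = Time.now
  matched = grok.match(line.chomp)
  puts format("%s in %.3fs", matched ? "matched" : "no match", Time.now - started)
end
# If the script prints "line N: " and then pins a CPU without finishing,
# line N is the one the pattern chokes on.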

It is most likely due to the grok pattern choking on something.

I ended up writing a manual parser in Ruby (plenty of examples are available on the web) and parsing the problematic log file manually.

It showed that URLs with "app-android://" choked the parsing process, so I changed the field type to something less strict (like NOTSPACE) and the problem went away (see the fragment below).

Mike
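For reference, and assuming the field that choked was the full_path capture in the pattern from the first post, one plausible form of that change is:

# Before: %{UNIXPATH} has nested quantifiers and backtracks heavily on values it can
# never match, such as "app-android://..." URLs.
%{UNIXPATH:full_path}(%{URIPARAM:parameters})?

# After: a less strict field type, as described above. %{NOTSPACE} matches anything up
# to the next space, so the separate optional parameters group is no longer needed.
%{NOTSPACE:full_path}

The trade-off is that full_path now also captures any query string, but the match can no longer blow up on unexpected URL schemes.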