Timeout executing grok

Hello, I am getting a timeout when running a grok filter. My Logstash config looks like this:

filter {
  if [type] == "cast" {
    grok {
      break_on_match => false

      match => { 'message' => '\[%{DATA:timestamp}\]  \[%{DATA:pool}\] %{WORD:ignore} %{NUMBER:pid}' }

      match => { 'message' => '%{DATESTAMP:timestamp} \[%{WORD:status}\] %{DATA:ignore}\: \*%{NUMBER:ignore} %{GREEDYDATA:error-message}' }

      match => { 'message' => '%{IPORHOST:ip} \- \- \[%{HTTPDATE:timestamp}\] "%{WORD:ignore} %{PATH:a-info}%{DATA:ignore}&%{WORD:ignore}=%{NUMBER:font}%{DATA:ignore}" %{INT:http_response} %{INT:wall} "-" %{DATA:ignore}\(%{WORD:OS}; %{DATA:ignore}\) %{WORD:browser}%{GREEDYDATA:ignore}' }

      match => { 'message' => '%{IPORHOST:ip} \- \- \[%{HTTPDATE:timestamp}\] "%{WORD:ignore} %{PATH:a-info}%{DATA:ignore}&%{WORD:ignore}=%{NUMBER:font}%{DATA:ignore}" %{INT:http_response} %{INT:wall} "%{URI:URL}" %{DATA:ignore}\(%{WORD:OS}%{DATA:ignore},%{DATA:ignore}\) %{WORD:browser}%{GREEDYDATA:ignore}' }
    }
  }
}

I checked the message it is timing out on, and it matches one of the patterns above. I don't understand why it would time out. Any help would be greatly appreciated.

I just updated the stack to the latest version and still see the same issue.

I've tried adding tag_on_timeout, then taking the timed-out event and simply grok-matching it against %{GREEDYDATA:timeout}, but that doesn't work.
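
For reference, a minimal sketch of how the grok filter's timeout settings fit together. `timeout_millis` and `tag_on_timeout` are real grok options, and the values shown are their documented defaults. The `grok_timed_out` field name is made up for illustration, and the pattern is just the first one from the config above.

```
filter {
  grok {
    match => { 'message' => '\[%{DATA:timestamp}\]  \[%{DATA:pool}\] %{WORD:ignore} %{NUMBER:pid}' }
    timeout_millis => 30000            # abort a match attempt after 30 s (default)
    tag_on_timeout => "_groktimeout"   # tag added to the event on timeout (default)
  }

  # A timed-out event keeps its original message, so it can be flagged or
  # routed with a plain conditional instead of a second grok.
  if "_groktimeout" in [tags] {
    mutate { add_field => { "grok_timed_out" => "true" } }  # hypothetical field name
  }
}
```

This only makes the timeouts visible. It does not make the slow pattern any faster.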

What is the exact error? Please post the log.

Timeout executing grok '%{IPORHOST:ip} - - [%{HTTPDATE:timestamp}] "%{WORD:ignore} %{PATH:a-info}%{DATA:ignore}&%{WORD:ignore}=%{NUMBER:font}%{DATA:ignore}" %{INT:http_response} %{INT:wall} "-" %{DATA:ignore}(%{WORD:OS}; %{DATA:ignore}) %{WORD:browser}%{GREEDYDATA:ignore}' against field 'message' with value 'Value too large to output (431 bytes)! First 255 chars are:

Does anyone have an idea of what could be causing this?

I added X-Pack to Logstash and see that CPU is at 95-97%. Is this because of the bad filter, or could it be what is causing the timeouts?

Does monitoring show you where it's spending most of its time?

It does not, unless I am looking in the wrong place.

You have a few occurrences of DATA and GREEDYDATA, which can match almost anything and so tend to be expensive and slow to evaluate. Try to be more specific. It also looks like you have several fields named ignore. If you do not want to capture the data, I believe you can simply leave the pattern unnamed.
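
A small sketch of the unnamed-pattern idea, applied to the second pattern from the original config. With the default `named_captures_only => true`, a pattern written as `%{DATA}` or `%{NUMBER}` still has to match but produces no field. The leading `^` is an extra assumption on my part: it presumes the timestamp is at the very start of the line.

```
filter {
  grok {
    # %{DATA} and %{NUMBER} without a field name match but capture nothing,
    # so no "ignore" field is created.
    # ^ lets a non-matching line fail once instead of being retried from
    # every offset in the message (assumes the line starts with the timestamp).
    match => { 'message' => '^%{DATESTAMP:timestamp} \[%{WORD:status}\] %{DATA}: \*%{NUMBER} %{GREEDYDATA:error-message}' }
  }
}
```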

Using GREEDYDATA is just to grab the rest of the log? Is there a better workaround than capturing each piece of data?

The last GREEDYDATA is fine, but you may want to be more specific and replace the DATA patterns earlier in the pattern. If you show what a log line looks like we may be able to provide better guidance.
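
Without a sample log line this is only a guess, but here is a sketch of what "more specific" could look like for the third pattern. It assumes the request is enclosed in double quotes and the user agent contains exactly one parenthesised group before the browser name. It is untested against real data.

```
filter {
  grok {
    # [^"]* cannot run past the closing quote, and [^(]* / [^)]* cannot run
    # past the parentheses, so a failed match gives up quickly instead of
    # backtracking through the rest of the line.
    # The trailing %{GREEDYDATA:ignore} is dropped: nothing after the browser
    # is captured and the pattern is not anchored at the end.
    match => { 'message' => '^%{IPORHOST:ip} - - \[%{HTTPDATE:timestamp}\] "%{WORD} %{PATH:a-info}%{DATA}&%{WORD}=%{NUMBER:font}[^"]*" %{INT:http_response} %{INT:wall} "-" [^(]*\(%{WORD:OS}; [^)]*\) %{WORD:browser}' }
  }
}
```

The one remaining `%{DATA}` sits between the path and the `&key=number` pair. If the query string has a fixed shape, that could be tightened as well.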
