Hey all,
I'm getting a grok timeout warning with a very specific format of log. Here is my grok filter:
grok {
match => { "message" => "fs_log: %{TIMESTAMP_ISO8601:timestamp} \|\| %{UNIXPATH:filename} \|\| %{WORD:caller} \|\| %{NUMBER:lineno} \|\| %{DATA:message} \|\| %{WORD:event} \|\| %{WORD:module} \|\| %{NOTSPACE:instance_id}" }
overwrite => [ "message" ]
keep_empty_captures => true
add_tag => [ "fs_log" ]
}
As you can see, I am using two pipes ||
to delineate the fields to be parsed by grok.
This works the vast majority of the time. However, there are two instances where I consistently get a grok timeout:
fs_log: 2019-04-12 17:28:24.434573 || /path/file.py || func_name || 338 || Subprocess call Failed: ERROR 2: CPLRealloc(): Out of memory allocating 423383040 bytes.
fs_log: 2019-04-12 17:28:29.940564 || /path/file.py || func_name || 338 || Subprocess call Failed: ERROR 2: ogrlinestring.cpp, 446: cannot allocate 732672016 bytes
These lines are obviously not supposed to match regardless, but then why is there a timeout? Here is a sample message of the warning:
[2019-04-12T17:31:21,999][WARN ][logstash.filters.grok ] Timeout executing grok 'fs_log: %{TIMESTAMP_ISO8601:timestamp} \|\| %{UNIXPATH:filename} \|\| %{WORD:caller} \|\| %{NUMBER:lineno} \|\| %{DATA:message} \|\| %{WORD:event} \|\| %{WORD:module} \|\| %{NOTSPACE:instance_id}' against field 'message' with value 'fs_log: 2019-04-12 17:28:29.940564 || /path/file.py || func_name || 338 || Subprocess call Failed: ERROR 2: ogrlinestring.cpp, 446: cannot allocate 732672016 bytes'!
And just to clarify, logstash is running on a different machine than the one that generates the log, so the fact that the log references memory allocation issues has nothing to do with the grok timeout.
Any help would be greatly appreciated!