Grok filter failing


#1

I have created a grok filter for JBoss logs. The grok fails when I include %{TIME:time} %{LOGLEVEL:level}, and I cannot read the time and level fields.

grok {
match => [
"%{TIME:time} %{LOGLEVEL:level} [(?[^]]+)] ((?[^)]+)) %{GREEDYDATA:message}"
]
overwrite => ["message"]
}

Examples:
2018-03-08 00:00:12,126 INFO [tellapp] (tellapp Listener - Thread-24) Connection coming from 10.170.133.10 on 4200 as 64275
2018-03-08 00:00:12,126 INFO [tellapp] (Thread-62684) End of Stream reached Closing connection. -1
2018-03-08 00:00:12,126 INFO [tellapp] (Thread-62684) Connection closed.
2018-03-08 00:00:27,126 INFO [tellapp] (tellapp Listener - Thread-24) Connection coming from 10.170.133.10 on 4200 as 28445
2018-03-08 00:00:27,126 INFO [tellapp] (Thread-62685) End of Stream reached Closing connection. -1
2018-03-08 00:00:27,126 INFO [tellapp] (Thread-62685) Connection closed.
2018-03-08 00:00:42,126 INFO [tellapp] (tellapp Listener - Thread-24) Connection coming from 10.170.133.10 on 4200 as 27809
2018-03-08 00:00:42,126 INFO [tellapp] (Thread-62686) End of Stream reached Closing connection. -1

Thank you


(Ry Biesemeyer) #2

From what I can tell, the problem has more to do with the raw regular expressions in the middle of the pattern:

  • because [, ], (, and ) carry special meaning in a regular expression, they need to be prefixed with a backslash (\) whenever attempting to match a literal character.
  • the non-capture grouping (?: expression ) is both unnecessary and missing a colon, which creates a syntax error in the underlying regular expression.
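
Both points can be checked outside Logstash. The following is only a rough sketch in Python: Python's re module is not the Oniguruma engine that grok uses, so treat it as an analogy rather than proof of grok's exact behaviour. The sample line is taken from the question; the variable names are my own.

```python
import re

line = "2018-03-08 00:00:12,126 INFO [tellapp] (Thread-62684) Connection closed."

# "(?" must be followed by a valid group specifier such as ":" or "P<name>".
# "(?[" is not one, so the regex compiler rejects the pattern outright.
try:
    re.compile(r"\((?[^)]+)\)")
    broken_compiles = True
except re.error:
    broken_compiles = False

# With the literal brackets and parentheses escaped, and ordinary capture
# groups in place of the malformed "(?", the same idea compiles and matches.
fixed = re.compile(r"\[([^\]]+)\] \(([^\)]+)\)")
app, thread = fixed.search(line).groups()

print(broken_compiles, app, thread)
```

The first pattern never gets as far as matching anything, which is consistent with the whole grok expression failing rather than just the middle fields coming back empty.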

I took your example lines, put them in the Grok Constructor, and fiddled with the patterns.


Below, I have fixed the escaping (backslashing the literal open- and close-brackets, as well as the close-bracket in the negated character class), and removed the unnecessary-and-not-quite-right non-capture grouping:

%{TIME:time} %{LOGLEVEL:level} \[[^\]]+\] \([^\)]+\) %{GREEDYDATA:message}

I also noticed that we should probably be using TIMESTAMP_ISO8601 to capture the timestamp, since the log lines start with a date as well as a time, and TIME would pick up only the time portion:

%{TIMESTAMP_ISO8601:time} %{LOGLEVEL:level} \[[^\]]+\] \([^\)]+\) %{GREEDYDATA:message}

Since we're attempting to match from the beginning of the string, we can make grok fail faster on non-matching lines by anchoring our pattern to the start of the string, prefixing it with the ^ anchor:

^%{TIMESTAMP_ISO8601:time} %{LOGLEVEL:level} \[[^\]]+\] \([^\)]+\) %{GREEDYDATA:message}

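If we define a couple more named grok patterns (the first two lines below are the custom definitions, the third is the match pattern that uses them), we can capture the bracketed and parenthesised groups too: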

NOTCLOSEBRACKET [^\]]+
NOTCLOSEPAREN [^\)]+
^%{TIMESTAMP_ISO8601:time} %{LOGLEVEL:level} \[%{NOTCLOSEBRACKET:application}\] \(%{NOTCLOSEPAREN:context}\) %{GREEDYDATA:message}
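
To sanity-check the shape of that final pattern without a Logstash instance, here is a minimal Python approximation. It is a sketch under assumptions: TIMESTAMP and LOGLEVEL below are simplified stand-ins I wrote for this log format, not the real definitions of grok's TIMESTAMP_ISO8601 and LOGLEVEL, and Python named groups stand in for the grok field names.

```python
import re

# Simplified stand-ins (assumptions), NOT the actual grok pattern definitions.
TIMESTAMP = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}"
LOGLEVEL = r"[A-Za-z]+"

# Same structure as the anchored grok pattern: timestamp, level,
# [application], (context), then the rest of the line as the message.
pattern = re.compile(
    rf"^(?P<time>{TIMESTAMP}) (?P<level>{LOGLEVEL}) "
    r"\[(?P<application>[^\]]+)\] \((?P<context>[^\)]+)\) (?P<message>.*)"
)

sample = (
    "2018-03-08 00:00:12,126 INFO [tellapp] (tellapp Listener - Thread-24) "
    "Connection coming from 10.170.133.10 on 4200 as 64275"
)

fields = pattern.match(sample).groupdict()
for name, value in fields.items():
    print(f"{name}: {value}")
```

Note that the context group keeps the spaces inside the parentheses, because the negated character class only stops at the closing parenthesis.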
