The documentation for the Grok filter states that to add a pattern to a custom patterns file, you:
- write the pattern you need as the pattern name, a space, then the regexp for that pattern.
For example, doing the postfix queue id example as above:
# contents of ./patterns/postfix:
POSTFIX_QUEUEID [0-9A-F]{10,11}
But when I look at some of the patterns that ship with Logstash, I see:
#Space is an allowed character to match special cases like 'Native Method' or 'Unknown Source'
JAVAFILE (?:[a-zA-Z$_0-9. -]+)
#Allow special <init>, <clinit> methods
JAVAMETHOD (?:(<(?:cl)?init>)|[a-zA-Z$_][a-zA-Z$_0-9]*)
#Line number is optional in special cases 'Native method' or 'Unknown source'
JAVASTACKTRACEPART %{SPACE}at %{JAVACLASS:[java][log][origin][class][name]}\.%{JAVAMETHOD:[log][origin][function]}\(%{JAVAFILE:[log][origin][file][name]}(?::%{INT:[log][origin][file][line]:int})?\)
# Java Logs
JAVATHREAD (?:[A-Z]{2}-Processor[\d]+)
JAVALOGMESSAGE (?:.*)
A lot of the patterns are wrapped in an extended (non-capturing) group, (?:...).
This confused the heck out of me. Was I supposed to do this in my own custom patterns as well? Why were they doing this in some patterns (JAVAFILE, JAVAMETHOD) but not in others (JAVASTACKTRACEPART)?
After some trial and error in the Grok Debugger I've confirmed that wrapping the Grok expressions (which are already regular expressions) in an extended group is completely superfluous.
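Here is a rough Python sketch of why the wrapper should make no difference. It is not Logstash code: it assumes that Grok expands each %{NAME} reference into a named group around the pattern body, and the helpers expand and method_name, the two pattern tables, and the sample lines are all made up for illustration. If that assumption holds, the outer (?:...) just adds a second layer of grouping around a body that is already grouped.

```python
import re

# The shipped JAVAMETHOD body, once with the (?:...) wrapper and once without it.
PATTERNS_WRAPPED = {"JAVAMETHOD": r"(?:(<(?:cl)?init>)|[a-zA-Z$_][a-zA-Z$_0-9]*)"}
PATTERNS_BARE = {"JAVAMETHOD": r"(<(?:cl)?init>)|[a-zA-Z$_][a-zA-Z$_0-9]*"}


def expand(expression, patterns):
    """Replace each %{NAME} with a named group (ASSUMED Grok-like expansion)."""
    def substitute(match):
        name = match.group(1)
        return "(?P<%s>%s)" % (name, patterns[name])
    return re.sub(r"%\{(\w+)\}", substitute, expression)


def method_name(line, patterns):
    """Return the text captured for JAVAMETHOD, or None if the line does not match."""
    regex = re.compile(expand(r"at %{JAVAMETHOD}\(", patterns))
    found = regex.search(line)
    return found.group("JAVAMETHOD") if found else None


# Hypothetical sample lines, including the special <init>/<clinit> cases.
SAMPLES = ["at <init>(", "at <clinit>(", "at doWork(", "no match here"]

for sample in SAMPLES:
    wrapped = method_name(sample, PATTERNS_WRAPPED)
    bare = method_name(sample, PATTERNS_BARE)
    print(repr(sample), wrapped, bare, wrapped == bare)
```

The alternation inside JAVAMETHOD is the case where I would have expected the wrapper to matter, but because the named group already encloses the whole body, the | cannot leak into the surrounding expression either way.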
Can we get rid of this then and keep these expressions as simple as possible and avoid confusing people new to Logstash like myself?
Or am I missing something here?
Thanks,
Frans