Grok regex with escaped "[", "]", "(", and ")" chars doesn't work


#1

Elastic newbie here - working with a new 5.5 install. I have a log line that looks like so:
[2015/10/01@19:48:22.785-0400] P-4780 T-2208 I DBUTIL : (451) prostrct create session begin for timk519 on CON:.

I have the following regex:
\[%{DATE:date}@%{TIME:time}-(?<gmtoffset>\d{4})\]\s*(?<procid>P-[0-9]+)\s*(?<threadid>T-[0-9]+)\s*(?<msgtype>[ifIF])\s*(?<processtype>[a-zA-Z]+)\s*(?<usernumber>[0-9]+|[:])\s*\((?<msgnum>[0-9]+|[\-]+)\)\s*%{GREEDYDATA:message}

When I try it in the Kibana Grok Debugger, it doesn't match and I get the following error:
GrokDebugger: [parse_exception] [pattern_definitions] property isn't a map, but of type [java.lang.String], with { header={ processor_type="grok" & property_name="pattern_definitions" } }

This regex works if I remove the [ and ] from both the regex and the log line. I've tried single, double, and triple escaping of the [ and ], to no avail.

I was able to escape the () around the msgnum tag which leaves me puzzled why escaping the [ ] characters doesn't work.

What am I missing?


#2

This hack works -
%{DATE:date}@%{TIME:time}-(?<timezone>\d{4}).\s*(?<procid>P-[0-9]+)\s*(?<threadid>T-[0-9]+)\s*(?<msgtype>[ifIF])\s*(?<processtype>[a-zA-Z]+)\s*(?<usernumber>[0-9]+|[:])\s*\((?<msgnum>[0-9]+|[\-]+)\)%{GREEDYDATA:message}

I added a "." after the {4}) to get past the ] - and it looks like so:

%{DATE:date}@%{TIME:time}-(?<timezone>\d{4}).

I'd prefer a solution which is explicit about the [ and ].


#3

This also works - add a "." as the leading char, and add a ] to capture the trailing ].

.%{DATE:date}@%{TIME:time}-(?<timezone>\d{4})\]

I get the impression the regex processor doesn't like a leading [ or \[

Well isn't that interesting - the editor is interpreting a \ as an escape character and only showing the next character...


#4

This pattern works in the Grok Debugger in Kibana, yet fails in production:

.%{DATE:date}@%{TIME:time}-(?<timezone>\d{4}).\s*(?<procid>P-[0-9]+)\s*(?<threadid>T-[0-9]+)\s*(?<msgtype>[ifIF])\s*(?<processtype>[a-zA-Z]+)\s*(?<usernumber>[0-9]+|[:])\s*\((?<msgnum>[0-9]+|[\-]+)\)\s*%{GREEDYDATA:message}

The problem is the escaped "(" and ")" used to extract the msgnum number.

Anyone?


#5

Turns out %{DATE} was the wrong pattern for this log: it only had a 2-digit year, while the log date has a 4-digit year. Once I changed it to a custom pattern that matched the 4-digit year in the log date, everything behaved normally, including the escaped [ ] and ( ).
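
For anyone who lands here with the same symptom, here is a minimal Python sketch of why the brackets looked like the culprit. It is not grok itself: MONTHNUM, MONTHDAY, YEAR and STOCK_DATE are rough approximations of the stock grok date definitions written from memory (an assumption, so check them against your installed patterns), and CUSTOM_DATE is a hypothetical year-first pattern, since the thread does not show the custom pattern that was actually used. TIME is also replaced by a simplified stand-in.

```python
import re

# Rough approximations of the stock grok date pieces (assumed, not taken from this thread).
MONTHDAY = r"(?:0[1-9]|[12][0-9]|3[01]|[1-9])"
MONTHNUM = r"(?:0?[1-9]|1[0-2])"
YEAR = r"(?:\d\d){1,2}"
STOCK_DATE = (
    rf"(?:{MONTHNUM}[/-]{MONTHDAY}[/-]{YEAR}"   # month-first form
    rf"|{MONTHDAY}[./-]{MONTHNUM}[./-]{YEAR})"  # day-first form
)

# Hypothetical year-first date matching 2015/10/01, plus a simplified time.
CUSTOM_DATE = r"\d{4}/\d{2}/\d{2}"
SIMPLE_TIME = r"\d{2}:\d{2}:\d{2}\.\d{3}"

line = ("[2015/10/01@19:48:22.785-0400] P-4780 T-2208 I DBUTIL : (451) "
        "prostrct create session begin for timk519 on CON:.")

# An escaped "[" pins the date to start at "2015", which neither stock form accepts.
anchored = re.search(r"\[(" + STOCK_DATE + r")@", line)

# A leading "." lets the match slide right, so only the tail of the date is captured.
dot_hack = re.search(r".(" + STOCK_DATE + r")@", line)

# With a year-first date, the escaped brackets and parentheses work as expected.
LINE_RE = re.compile(
    r"\[(?P<date>" + CUSTOM_DATE + r")@(?P<time>" + SIMPLE_TIME + r")"
    r"-(?P<gmtoffset>\d{4})\]\s*"
    r"(?P<procid>P-[0-9]+)\s*"
    r"(?P<threadid>T-[0-9]+)\s*"
    r"(?P<msgtype>[ifIF])\s*"
    r"(?P<processtype>[a-zA-Z]+)\s*"
    r"(?P<usernumber>[0-9]+|:)\s*"
    r"\((?P<msgnum>[0-9]+|-+)\)\s*"
    r"(?P<message>.*)"
)
fields = LINE_RE.match(line).groupdict()

print("anchored stock date:", anchored)
print("dot hack captured:  ", dot_hack.group(1))
print("custom pattern date:", fields["date"], "msgnum:", fields["msgnum"])
```

If the approximations hold, this would also explain why the "." workarounds in the earlier posts appeared to succeed: the date field was silently capturing a truncated value rather than the full date, so it is worth checking the captured fields and not just whether the pattern matched.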

