OK, a colleague of mine wrote a grok filter to create a few fields from Suricata logs.
An example of a log input would be this:
Nov 3 15:20:33 192.168.1.1 suricata[42959]: [1:2200029:1] SURICATA ICMPv6 unknown type [Classification: (null)] [Priority: 3] {IPV6-ICMP} fa80:0000:0000:0000:c233:1224:241a:1efb:143 -> ff03:0000:0000:0000:0000:0000:0000:1119:0
And the grok filter is this:
match => [ "message", "<(?<evtid>.*)>(?<datetime>(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\s+(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]) (?:2[0123]|[01]?[0-9]):(?:[0-5][0-9]):(?:[0-5][0-9])) (?<prog>.*?): (?<msg>.*)" ]
}
What I need help with:
A) The field evtid always comes out as "141". If I am correct, (?<evtid>.*) creates a field called evtid, and the pattern it looks for is ".*". This is where I need help. Thinking back to my compilers class (flex, bison, etc.), "." meant any character except a newline. So what would .* mean? Any character, zero or more times? So is it basically counting characters?
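One way to test the guess about ".*" is to try it outside Logstash. The snippet below is only a minimal sketch in Python, not the actual grok run. It assumes the raw line arrives with a "<141>" prefix in front of the date, which the pasted sample does not show, and it uses Python's (?P<name>...) spelling for named groups where grok writes (?<name>...).

```python
import re

# Hypothetical input: a shortened copy of the sample line with an ASSUMED
# "<141>" prefix, which the pasted sample does not show.
line = "<141>Nov 3 15:20:33 192.168.1.1 suricata[42959]: [1:2200029:1] SURICATA ICMPv6 unknown type"

# Python writes named groups as (?P<name>...); grok uses (?<name>...).
m = re.match(r"<(?P<evtid>.*)>(?P<rest>.*)", line)
evtid = m.group("evtid")  # the text between "<" and ">", not a character count
rest = m.group("rest")    # everything after the ">"

# ".*" may match zero characters, and "." stops at a newline by default.
empty = re.match(r".*", "").group(0)
first_line = re.match(r".*", "ab\ncd").group(0)

print(repr(evtid), repr(empty), repr(first_line))
```

Under that assumption, the named group holds the matched text itself, so ".*" reads as "any character except a newline, zero or more times" and captures whatever sits between "<" and ">" instead of counting anything.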
B) What is "?:" used for? If you look at the grok match, ?: comes up a ton of times. Again, my knowledge of regex tells me that [0-1]? means that 0 or 1 will be there zero or one times. It obviously does exactly that for the months (it can be either Jan or January, for example). So does it mean that xxxx in (?: xxxx ) might or might not exist? But then again, it also appears before every logical or ( ?| ).
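The two readings of "?:" can also be separated with a small experiment. This is a hedged sketch in Python's re module, assuming its group syntax behaves like grok's on this point: "(?:" only opens a group that is not captured, while a "?" placed after the closing parenthesis is what makes the group optional. The variable names are illustrative only.

```python
import re

# Capturing groups: every "(...)" shows up in groups().
cap = re.match(r"(Jan(uary)?) (\d+)", "January 3").groups()

# Same pattern, but "(?:uary)" groups without capturing.
noncap = re.match(r"(Jan(?:uary)?) (\d+)", "January 3").groups()

# The trailing "?" is what makes "uary" optional, not the "?:".
short_ok = re.fullmatch(r"Jan(?:uary)?", "Jan") is not None
long_ok = re.fullmatch(r"Jan(?:uary)?", "January") is not None

# Without a trailing "?", a "(?:...)" group is still required.
required_missing = re.fullmatch(r"a(?:bc)d", "ad")

# "|" is ordinary alternation inside the outer non-capturing group.
alt = re.fullmatch(r"(?:Jan(?:uary)?|Feb(?:ruary)?)", "Feb")

print(cap, noncap, short_ok, long_ok, required_missing, alt is not None)
```

Here cap has three entries and noncap has two, which suggests that "?:" changes only what is captured, not what is matched.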
I am currently looking into documentation again to figure it out but I would really appreciate some insight from you guys.
Thanks a lot,
Nick.