Grok patterns with multiline handling in Filebeat


#1

Hi,
I used to use grok patterns even in the multiline 'pattern' section, and it was very useful.
Any tips on how to compensate for this on the Filebeat multiline side?


(ruflin) #2

Grok is currently not supported in Filebeat, so there is no workaround or replacement. I assume the problem is that the regular expressions get too complex?


#3

The problem is that this:
pattern => "%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}%{GREEDYDATA}"

becomes this:
pattern => "(?>\d\d){1,2}-(?:0?[1-9]|1[0-2])-(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]) (?!<[0-9])(?:2[0123]|[01]?[0-9]):(?:[0-5][0-9])(?::(?:(?:[0-5][0-9]|60)(?:[:.,][0-9]+)?))(?![0-9])*"

It is awful, to say the least... think about maintaining it... :confused:

And I still have to check whether it works; for now I've just done a literal substitution.


(Magnus Bäck) #4

But do you really need to be that picky? Is it remotely conceivable that you'll have lines that begin with e.g. "9999-32-80 ..." that you need to recognize as not starting with a timestamp? If not, that regexp can be vastly simplified.


#5

I have logs that start like this:

29-Dec-2015 08:16:30,069|N.A.|ajp-/0.0.0.0:8009-6|org.apache.cxf.i......

2015-12-29 08:51:57.433 - DEBUG - 47afe67..................

and they are multiline. @magnusbaeck, do you mean that I could simply check a pattern like
(numeric)^4(-)(numeric)^2(-)(numeric)^2 (numeric)^2(:)(numeric)^2(:)(numeric)^2(.)(numeric)^3
to match this, for example?
2015-12-29 08:51:57.433

If so, that's a good point; I'll try it.


(Magnus Bäck) #6

Yes. If you only need to distinguish between "2015-12-29" and "29-Dec-2015" you don't need a very complicated expression.
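As a rough sketch of how loose such expressions can be, here are two patterns that only check the shape of the timestamp, not whether the values are valid. Python is used purely for illustration; the names `ISO_START`, `DMY_START` and `classify` are made up for this example, and the patterns should still be verified against Filebeat's own regexp engine before use.

```python
import re

# Hypothetical names. Both patterns check only the *shape* of the
# timestamp, so "9999-32-80 ..." would also match the first one.
ISO_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}")
DMY_START = re.compile(r"^\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2},\d{3}")


def classify(line):
    """Return which timestamp style starts the line, or None otherwise."""
    if ISO_START.match(line):
        return "iso"
    if DMY_START.match(line):
        return "dmy"
    return None  # e.g. a continuation line of a multiline event


if __name__ == "__main__":
    print(classify("2015-12-29 08:51:57.433 - DEBUG - 47afe67"))
    print(classify("29-Dec-2015 08:16:30,069|N.A.|ajp-/0.0.0.0:8009-6|"))
    print(classify("    some continuation text"))
```

The trade-off is the one described above: a line such as "9999-32-80 00:00:00.000" would be accepted as a start line, which is fine as long as no real log line looks like that.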


#7

Solution:

29-Dec-2015 08:16:30,069|N.A.|ajp-/0.0.0.0:8009-6|org.apache.cxf.i......

becomes:
^\d\d-[A-Za-z][A-Za-z][A-Za-z]-\d\d\d\d \d\d:\d\d:\d\d,\d\d\d.*

and so on for the other log formats.
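For anyone who wants to try the pattern before putting it into the Filebeat config, here is a minimal sketch in Python that imitates what I understand the multiline options `negate: true` with `match: after` to do: every line that does not match the pattern is appended to the previous event. The function name `group_events` and the sample lines are invented for the example, and this is only an approximation of Filebeat's behaviour, not its actual implementation.

```python
import re

# The pattern from the solution above, unchanged.
START = re.compile(
    r"^\d\d-[A-Za-z][A-Za-z][A-Za-z]-\d\d\d\d \d\d:\d\d:\d\d,\d\d\d.*"
)


def group_events(lines):
    """Join lines that do not match START onto the preceding event."""
    events = []
    for line in lines:
        if START.match(line) or not events:
            # A timestamped line (or the very first line) opens a new event.
            events.append(line)
        else:
            # Anything else is treated as a continuation line.
            events[-1] += "\n" + line
    return events


if __name__ == "__main__":
    # Invented sample input, only shaped like the logs in this thread.
    sample = [
        "29-Dec-2015 08:16:30,069|N.A.|first event",
        "  continuation of the first event",
        "29-Dec-2015 08:16:31,000|N.A.|second event",
    ]
    for event in group_events(sample):
        print(repr(event))
```

The trailing `.*` in the pattern is not needed for this check, since only the start of the line matters; it is kept here so the sketch uses exactly the expression posted above.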

