Help with Filebeat RegEx Match


(Gary Cherneski) #1

Version: Filebeat 1.2.0-x86-64
I'm new to ELK and Filebeat, and I read online that it is preferable to do multiline parsing in Filebeat rather than in Logstash. I could use help with a Filebeat multiline regex that matches the following two timestamp formats, both of which can start a line in the same file:
2016-02-07 23:39:14
07 Feb 2016 23:39:47

I want to OR them together and negate both for a multiline, something like ^(regex1 | regex2).
Here is an incorrect stab at the regex model:

multiline:
  pattern: '^[^({19|20}{0-9}{0-9} {0-9}{0-9}:{0-9}{0-9}:{0-9}{0-9})|^({0-3}{0-9} {Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec} {19|20}{0-9}{0-9})'

The goal is that any line that does not start with either of the above timestamps is appended to the preceding timestamped line.
I'm using defaults for: negate, match, max_lines, and timeout.

Sample Log:
2016-02-07 23:39:14 Commons Daemon procrun stdout initialized
07 Feb 2016 23:39:15 INFO EnvironmentVariablesChecker
2016-02-07 23:39:16,182 DEBUG org.hibernate.foo.internal.util.LogHelper - PersistenceUnitInfo [
name: default
persistence provider classname: null
123line example starting with a number
2016-02-07 23:39:17 Commons Daemon procrun stdout initialized
2016-02-07 23:39:17,182 DEBUG org.hibernate.jpa.internal.util.LogHelper - PersistenceUnitInfo [
name: default
persistence provider classname: null
07 Feb 2016 23:39:18 INFO EnvironmentVariablesChecker
I do appreciate your help. Thank you in advance.


(ruflin) #2

Perhaps this doc here can help? https://www.elastic.co/guide/en/beats/filebeat/current/regexp-support.html


(Steffen Siering) #3

What a mumbo-jumbo. Different timestamp formats + sometimes log level and sometimes not... Seems like we really have to use timestamps here.

My solution: https://play.golang.org/p/LD40aV1dcx
In the output, lines printed with 'true' (negate is enabled) will be merged with the line before them.

My final pattern is '^(20[0-9]{2}(-[0-9]{2}){2} [0-9]{2}(:[0-9]{2}){2})|([0-9]{2} [JFMASOND][a-z]{2} 20[0-9]{2})'. This only matches timestamps from the years 20xx. I shortened the list of months to [JFMASOND][a-z]{2}.


(Gary Cherneski) #4

Worked like a charm. I appreciate your help.

