Hi,
I'm trying to configure Filebeat to process a log file in which records usually span multiple lines and are separated by a blank line, but occasionally are not.
Here's an example:-
2018-07-02T21:10:09.775 Start ProcessXMLMessage
2018-07-02T21:10:09.775 Start ThingXML
2018-07-02T21:10:09.776 Before CommitData. CodeID=AQO2 741392570000103001 User=5001 Date=02072018 Time=210912 Scan=ABC1 Length=0041 Width=0059 Height=0061 Weight=0000000000
2018-07-02T21:10:09.799 End WeightXML. op_Errors=
2018-07-02T21:10:09.802 End ProcessXMLMessage
2018-07-02T21:10:09.923 Start ProcessXMLMessage
2018-07-02T21:10:09.924 Start ThingXML
2018-07-02T21:10:09.926 Before CommitData. CodeID=AHF5 939988635943627001 User=5001 Date=02072018 Time=210841 Scan=ABC1 Length=0006 Width=0018 Height=0021 Weight=0000000000
2018-07-02T21:10:10.071 End WeightXML. op_Errors=
2018-07-02T21:10:10.072 End ProcessXMLMessage
2018-06-30T22:21:58.211 Start ProcessXMLMessage
2018-06-30T22:21:58.212 Start IODXML
2018-06-30T22:21:58.213 IODXML Item=50000003388090
2018-06-30T22:21:58.214 ProcessData User=170005 ItemNumber=50000003388090 Items=1 FailureCode=00 ProcessedDate=30062018 ProcessedTime=1415
2018-06-30T22:21:58.215 ProcessData ll_ValidFailCode=TRUE FailureCode=00 ldte_ImpDate=30/06/2018
2018-06-30T22:21:58.215 ProcessData GPS Coordinates=51.754193,0.006483 GPS DoP=1 GPS Date/Time=30/06/2018 14:15:51
2018-06-30T22:21:58.240 ProcessData loop end ItemNumber=50000003388090
2018-07-02T21:10:45.595 Item Number 31015080070677 Does Not Exist (DISCREPSCN1.10 G3101574370677000 501081910710071009451050DC01 )
2018-07-02T21:11:09.381 Start ProcessXMLMessage
2018-07-02T21:11:09.383 Start WeightXML
2018-07-02T21:11:09.387 Before CommitScanData. CodeID=ABA14831408400321777001 User=5001 Date=02072018 Time=210936 Scan=ABC1 Length=0000 Width=0000 Height=0000 Weight=0000000000
2018-07-02T21:11:09.422 End WeightXML. op_Errors=
2018-07-02T21:11:09.423 End ProcessXMLMessage
I've successfully grouped the multiline records by matching lines that start with a number (every line of data is timestamped), using this simple pattern (it is unanchored, so strictly it matches any line containing a digit):-
multiline.pattern: '[0-9]'
multiline.negate: false
multiline.match: after
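As a rough illustration of what this config does, the Python sketch below models `negate: false` with `match: after` as "a matching line is appended to the event in progress, a non-matching line starts a new one". This is a simplified model and not Filebeat's actual implementation; `group_lines` and `sample` are hypothetical names, the sample is shortened, and the position of the blank lines in it is an assumption.

```python
import re


def group_lines(lines, pattern):
    """Simplified model of multiline with negate: false, match: after.

    A line that matches the pattern is appended to the event in progress;
    a line that does not match (here, a blank line) starts a new event.
    """
    regex = re.compile(pattern)
    events, current = [], []
    for line in lines:
        if regex.search(line) and current:
            current.append(line)  # continuation of the current event
        else:
            if current:
                events.append(current)
            current = [line]  # a non-matching line (or the first line) opens an event
    if current:
        events.append(current)
    return events


# Shortened sample. Assumption: blank lines separate the multiline records,
# but there is none in front of the "Does Not Exist" line.
sample = [
    "2018-07-02T21:10:09.775 Start ProcessXMLMessage",
    "2018-07-02T21:10:09.802 End ProcessXMLMessage",
    "",
    "2018-06-30T22:21:58.211 Start ProcessXMLMessage",
    "2018-06-30T22:21:58.240 ProcessData loop end ItemNumber=50000003388090",
    "2018-07-02T21:10:45.595 Item Number 31015080070677 Does Not Exist",
    "",
    "2018-07-02T21:11:09.381 Start ProcessXMLMessage",
    "2018-07-02T21:11:09.423 End ProcessXMLMessage",
]

events = group_lines(sample, "[0-9]")
for event in events:
    print(len(event), "line(s), ending with:", event[-1])
```

In this model the "Does Not Exist" line contains a digit, so it is treated as a continuation and ends up inside the preceding record instead of forming its own event. The blank lines are also kept as the first line of each later event.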
However, the data above contains five records, not four. The fifth record is the single line six rows before the end ("Item Number ... Does Not Exist"). This particular message is the only one that appears as a single line, and the only one that is not surrounded by blank lines.
I've tried a regexp to look for blank lines and the "Does Not Exist" line, but it doesn't seem to work, possibly because I want to discard the blank lines but keep the "Does Not Exist" one:-
(^\r?\n)|(Does Not Exist)
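One thing that may be worth checking is whether the blank-line half of this pattern can match at all. The Python sketch below tests the pattern against a blank line with and without its line terminator. It assumes, without confirmation here, that Filebeat removes the terminator before applying the pattern; `^$` is shown only as a hypothetical alternative for comparison, and Python's `re` is standing in for Filebeat's regex engine.

```python
import re

# The pattern from the post, plus a hypothetical alternative for blank lines.
original = re.compile(r"(^\r?\n)|(Does Not Exist)")
alternative = re.compile(r"^$")

blank_stripped = ""   # blank line with its terminator removed (assumed Filebeat behaviour)
blank_raw = "\r\n"    # blank line with its terminator still attached
single = "2018-07-02T21:10:45.595 Item Number 31015080070677 Does Not Exist"

results = {
    "original vs stripped blank": bool(original.search(blank_stripped)),
    "original vs raw blank": bool(original.search(blank_raw)),
    "original vs single-line record": bool(original.search(single)),
    "alternative vs stripped blank": bool(alternative.search(blank_stripped)),
}

for name, matched in results.items():
    print(name, "->", matched)
```

If the terminator really is stripped first, the `^\r?\n` branch never sees a newline to match, which would leave only the "Does Not Exist" branch doing any work.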
Can anyone point me towards a multiline config that will split this data up as required?
Thanks,
J.