As I'm instrumenting a new application for log monitoring via filebeat, I came across a log format I've never seen before.
It's a multiline log format, but instead of continuing as bare text, each continuation line repeats the header of the log entry, as follows:
[Thread |2640|D] 2019-06-13 00:00:00.021-05:00 Line 1 of message:
[Thread |2640|H] 2019-06-13 00:00:00.021-05:00 Line 2 of message:
[Thread |2640|H] 2019-06-13 00:00:00.021-05:00 Line 3 of message
[Thread |2640|H] 2019-06-13 00:00:00.021-05:00 Line 4 of message
[Thread |2640|H] 2019-06-13 00:00:00.021-05:00 Line 5 of message
Taking the following example of a line header:
[Thread |2640|D]
- Thread is obvious: it represents the thread name.
- 2640 is actually a 2-byte hex number that appears to be a message ID. However, there are some duplicates of this between consecutive messages, so I'm not certain about this.
- D is a single-character code that appears to be the message severity, i.e., DEBUG in this case. However, subsequent entries that are clearly part of the same message have a code of H, which appears to indicate a continuing line.
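To make my reading of the header concrete, here is a minimal Python sketch of the grouping I'm after. It assumes the layout is always `[<thread name> |<4 hex digits>|<one-letter code>]` followed by a date, a time, and the text, and that a code of H always marks a continuation line. The names `HEADER` and `merge_entries` are mine, purely for illustration; this is not filebeat or logstash configuration.

```python
import re

# Assumed layout: [<thread name> |<4 hex digits>|<one-letter code>] <date> <time> <text>
HEADER = re.compile(
    r"^\[(?P<thread>[^|]*?)\s*\|(?P<msg_id>[0-9A-Fa-f]{4})\|(?P<code>[A-Z])\]\s"
    r"(?P<timestamp>\S+ \S+)\s(?P<text>.*)$"
)


def merge_entries(lines):
    """Group lines into entries; a line with code 'H' continues the previous entry."""
    entries = []
    for line in lines:
        m = HEADER.match(line)
        if m is None:
            continue  # line is not in the assumed format; skipped in this sketch
        if m["code"] == "H" and entries:
            # Continuation: keep only the text, dropping the repeated header.
            entries[-1]["text"] += "\n" + m["text"]
        else:
            entries.append({
                "thread": m["thread"],
                "msg_id": m["msg_id"],
                "code": m["code"],
                "timestamp": m["timestamp"],
                "text": m["text"],
            })
    return entries


sample = [
    "[Thread |2640|D] 2019-06-13 00:00:00.021-05:00 Line 1 of message:",
    "[Thread |2640|H] 2019-06-13 00:00:00.021-05:00 Line 2 of message:",
    "[Thread |2640|H] 2019-06-13 00:00:00.021-05:00 Line 3 of message",
]
for entry in merge_entries(sample):
    print(entry["code"], entry["timestamp"])
    print(entry["text"])
```

The sketch keys continuation on the H code alone rather than on the hex ID, since I'm not sure the ID is unique per message.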
My questions are:
- Is it possible to write a filebeat multi-line pattern to concatenate these lines? What would that pattern look like?
- Assuming that I can write a filebeat multi-line rule to concatenate these lines, how would I write a logstash rule to clean up the subsequent lines, if possible?