Is there a Filebeat multiline filter for a 'start xxxx end' string?


(zoplex) #1

Hi,

Is there a Filebeat multiline filter for input lines like the following?

" STARTOFSTRING ..
....
....
... ENDOFSTRING

Could that be captured with filebeat.yml? The number of lines in between is variable, and the file also contains other lines that are not in this format: regular single lines and other multiline entries.

I tried a couple of regexes but could not get Filebeat to work; it writes errors to the log file after starting.

Thanks


(Magnus Bäck) #2

I doubt this is possible. If all messages had been delimited like this it would've been easy.


(zoplex) #3

Understood - thank you Magnus;

If this was to be done in logstash/grok, then this would be possible? Is there any code sample for that - if it is possible.
The variable number of lines between the start and the end token is unfortunately unavoidable as it is the way the source data is.

Thank you,


(zoplex) #4

.. another option - if we can say that we are ok with grabbing first 10 lines of that multiline and we fix that number - to avoid dealing with variable number of lines - would that help to get it processed inside the filebeat.yml?


(Magnus Bäck) #5

> If this was to be done in logstash/grok, then this would be possible?

A grok filter can't join multiple lines. Logstash's multiline codec and filter work in the same way as Filebeat.

> if we can say that we are ok with grabbing first 10 lines of that multiline and we fix that number - to avoid dealing with variable number of lines - would that help to get it processed inside the filebeat.yml?

Filebeat's multiline feature doesn't give you an option to join the current line with the next 10 lines so I don't see how that would make a difference.


(zoplex) #6

ok - thank you for confirming Magnus.


(Steffen Siering) #7

You're basically asking for some kind of bracketing. This is not supported by filebeat multiline yet. Have you got a more complete sample log to share? Sometimes looking at logs one can still find some kind of structure one can pattern match on.


(zoplex) #8

Sure, here is one example of that log pattern. It starts with the "Quorum results" line and ends with the "SYNCED" line, with a variable number of lines in between. We need to get this into one record in ES. I don't think it is critical for that to be done in Filebeat; it could be done in logstash/grok too:

2016-07-29 11:43:40 13167 [Note] WSREP: Quorum results:
version = 3,
component = PRIMARY,
conf_id = 6,
members = 3/3 (joined/total),
act_id = 49,
last_appl. = -1,
protocols = 0/7/3 (gcs/repl/appl),
group UUID = c843ba16-5038-xxxxxxxxx-9e850c396f04
2016-07-29 11:43:40 13167 [Note] WSREP: Flow-control interval: [400, 400]
2016-07-29 11:43:40 13167 [Note] WSREP: Restored state OPEN -> JOINED (49)
2016-07-29 11:43:40 13167 [Note] WSREP: New cluster view: global state: c843ba16-5038-xxxxxxxxx-9e850c396f04:49, view# 7: Primary, number of nodes: 3, my index: 1, protocol version 3
2016-07-29 11:43:40 13167 [Note] WSREP: SST complete, seqno: 49
2016-07-29 11:43:40 13167 [Note] WSREP: Member 1.0 (node2) synced with group.
2016-07-29 11:43:40 13167 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 49)


(Steffen Siering) #9

Well, all lines of the form <name> = ... are easy to capture. The sample is a little incomplete, though:

  1. Does the log file contain only multiline events that start with "Quorum results" and end with "SYNCED", or is there potentially other content in the log file?

  2. Should all these lines (including the regular log lines) be included?

  3. Can multiple events be intermixed?
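
As a rough illustration of why the <name> = ... lines are the easy part: they do not start with a timestamp, so a continuation-style rule can attach them to the preceding line. This is roughly the behaviour of a multiline setup with a timestamp pattern, negate: true and match: after. The Python sketch below only simulates that idea; the function name join_continuations and the timestamp regex are assumptions made for illustration, not Filebeat code.

```python
import re

# Assumed timestamp prefix, taken from the shape of the sample log lines.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ")

def join_continuations(lines):
    """Append every line that lacks a timestamp prefix to the previous event."""
    events = []
    for line in lines:
        line = line.rstrip("\n")
        if TIMESTAMP.match(line) or not events:
            events.append(line)          # a new event starts here
        else:
            events[-1] += "\n" + line    # continuation of the previous event
    return events

# A few lines copied from the sample log.
sample = [
    "2016-07-29 11:43:40 13167 [Note] WSREP: Quorum results:",
    "version = 3,",
    "component = PRIMARY,",
    "2016-07-29 11:43:40 13167 [Note] WSREP: Flow-control interval: [400, 400]",
]
events = join_continuations(sample)
```

With this rule the "Quorum results" line and its <name> = ... lines become one event, but every later timestamped line (up to and including the SYNCED line) still starts a new event, which is why this alone does not give the full start-to-end record.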


(zoplex) #10

There are other events in the log file, in different formats, and they are NOT intermixed. I do need to flatten the whole example above into one log line in ES (all 17 lines in this example), but other examples of the same format could have a different number of lines and different information between the start and end lines. So the format of the first and the last line is known:
FIRST LINE: 2016-07-29 11:43:40 13167 [Note] WSREP: Quorum results:
..
middle lines - UNKNOWN FORMAT
...
LAST LINE: 2016-07-29 11:43:40 13167 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 49)

but the format of the in-between lines is generally not known.


(Steffen Siering) #11

With different formats 'inside' the multiline event and potentially similar formats 'outside' of the bracketing, I indeed see no way of doing it with simple regexes.

The initial proposal did contain support for start/end event markers, but it was not implemented at the time. Feel free to open an enhancement request.
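
Until something like that exists, start/end bracketing could be done as a preprocessing step before the file is shipped. Below is a minimal Python sketch, assuming the first and last line formats given above. START, END and bracket_events are hypothetical names, the regexes are guesses based on the one sample, and emitting an unterminated block at end of input is just one possible choice.

```python
import re

# Assumed markers, based on the first and last line of the sample event.
START = re.compile(r"WSREP: Quorum results:$")
END = re.compile(r"WSREP: Shifting JOINED -> SYNCED")

def bracket_events(lines):
    """Yield one event per line, except that everything from a START line
    through the next END line is joined into a single event."""
    buffer = None                      # None means we are outside a bracket
    for line in lines:
        line = line.rstrip("\n")
        if buffer is None:
            if START.search(line):
                buffer = [line]        # open a bracketed event
            else:
                yield line             # ordinary single-line event
        else:
            buffer.append(line)
            if END.search(line):
                yield " ".join(buffer) # close the bracket, emit one record
                buffer = None
    if buffer is not None:
        yield " ".join(buffer)         # unterminated block: emit what we have

log = [
    "unrelated line before",           # placeholder, not from a real log
    "2016-07-29 11:43:40 13167 [Note] WSREP: Quorum results:",
    "version = 3,",
    "2016-07-29 11:43:40 13167 [Note] WSREP: SST complete, seqno: 49",
    "2016-07-29 11:43:40 13167 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 49)",
    "unrelated line after",            # placeholder, not from a real log
]
events = list(bracket_events(log))
```

The flattened output could then be written to a separate file and read as plain single-line events. Note that this relies on the end marker appearing only as the end of a bracketed block, which may not hold if a similar line can occur on its own elsewhere in the log.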

