Multiline settings for handling edge cases of stdout logs of mixed ndjson and java

Hi,
I am working with a system that runs Spring apps as WARs in Tomcat, and I am sending the logs to an ingest pipeline using Filebeat. Unfortunately, for legacy reasons, the log output mixes Tomcat startup logging, application ndjson, and assorted Spring/Logback output that isn't timestamped.

A sample of the logs looks like this:

30-Jan-2023 00:16:39.364 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["http-nio-8080"]
30-Jan-2023 00:16:39.377 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in [49] milliseconds
INFO: Server startup in 589 ms
{"timestamp":"2023-01-30T15:34:25+08:00","message":"some message here","some-other-stuff":"xxx"}
30-Jan-2023 00:16:59.377 ERROR [main] some.java.app.Component
Exception in thread "main" java.lang.NullPointerException
    at Main.main(Main.java:10)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

The following config actually works very well with this...

parsers:
- multiline:
    type: pattern
    pattern: '^\{"|^[0-9]{4}-[0-9]{2}-[0-9]{2}|^[0-9]{2}-[A-Za-z]{3}-[0-9]{4}'
    negate: true
    match: after

However, an edge case arises when a stand-alone log entry (without a timestamp) follows an ndjson message, like so:

{"timestamp":"2023-01-30T15:34:25+08:00","message":"some message here"}
INFO: Server startup in 589 ms

This causes the multiline parser to append the second line to the first, because it doesn't detect a start-of-record, and that prevents the event from being parsed later as JSON.
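
To make the edge case concrete, here is a minimal Python sketch that approximates the `negate: true` / `match: after` grouping described above. It is not Filebeat's implementation; the names `START_OF_RECORD` and `group_multiline` are made up for illustration, and the pattern assumes a two-digit day in the Tomcat-style timestamp, as in the sample lines.

```python
import json
import re

# Start-of-record pattern, assuming a two-digit day for the Tomcat timestamp.
START_OF_RECORD = re.compile(
    r'^\{"|^[0-9]{4}-[0-9]{2}-[0-9]{2}|^[0-9]{2}-[A-Za-z]{3}-[0-9]{4}'
)

def group_multiline(lines):
    """Approximate negate: true / match: after.

    A line matching the pattern starts a new event; any other line is
    appended to the event that precedes it.
    """
    events = []
    for line in lines:
        if START_OF_RECORD.search(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events

sample = [
    '{"timestamp":"2023-01-30T15:34:25+08:00","message":"some message here"}',
    'INFO: Server startup in 589 ms',
]

events = group_multiline(sample)

# The merged event is no longer valid JSON because of the trailing text.
try:
    json.loads(events[0])
    parsed_ok = True
except json.JSONDecodeError:
    parsed_ok = False

print(len(events), parsed_ok)  # 1 False
```

The untimestamped `INFO:` line does not match any alternative in the pattern, so it is glued onto the JSON line and the combined event fails to decode.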

It seems like "flush_pattern" might be useful, but it would need to be a pattern that doesn't match, due to negation:

parsers:
- multiline:
    type: pattern
    pattern: '^\{"|^[0-9]{4}-[0-9]{2}-[0-9]{2}|^[0-9]{2}-[A-Za-z]{3}-[0-9]{4}'
    negate: true
    match: after
    flush_pattern: '~(\})$'

I've tried several variations on the flush pattern regular expression, but I can't seem to negate the match so that it marks the end of an ndjson message.
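
One possible reason the `~` attempt has no effect: in common regex syntaxes `~` is an ordinary literal character, not a negation operator. The short check below uses Python's `re` module purely for illustration. Filebeat compiles its patterns with Go's regexp syntax, where, as far as I know, `~` is likewise a literal, but treat that as an assumption rather than confirmed Filebeat behaviour.

```python
import re

# The flush pattern from the config above, unchanged.
flush = re.compile(r'~(\})$')

ndjson_line = '{"timestamp":"2023-01-30T15:34:25+08:00","message":"some message here"}'

# The ndjson line ends in "} rather than ~}, so the pattern does not match it.
print(bool(flush.search(ndjson_line)))  # False

# It only matches text that literally ends with a tilde followed by }.
print(bool(flush.search('value~}')))    # True
```

So the pattern as written looks for a literal `~}` at the end of the line instead of "a line that does not end in `}`".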

Any suggestions?

  • ES version: 8.3.3
  • filebeat version: 8.3.3
  • OS: centos 7

Hi, just want to make sure I understand what the goal is. Are you fine with only ingesting lines that have timestamps? Can all the ndjson be ignored?

Yeah, dropping them would be fine. (They are also sent to another log file, which can be parsed separately as single-line JSON events.)
