Hi,
I'm working with a system that runs Spring apps as WARs in Tomcat, and I'm shipping the logs to an ingest pipeline using filebeat. Unfortunately, for legacy reasons, the log output mixes Tomcat startup logging, application ndjson, and assorted Spring/logback/other output that isn't timestamped.
A sample of the logs looks like this:
30-Jan-2023 00:16:39.364 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["http-nio-8080"]
30-Jan-2023 00:16:39.377 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in [49] milliseconds
INFO: Server startup in 589 ms
{"timestamp":"2023-01-30T15:34:25+08:00","message":"some message here","some-other-stuff":"xxx"}
30-Jan-2023 00:16:59.377 ERROR [main] some.java.app.Component
Exception in thread "main" java.lang.NullPointerException
at Main.main(Main.java:10)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
The following config actually works very well with this...
parsers:
  - multiline:
      type: pattern
      pattern: '^\{"|^[0-9]{4}-[0-9]{2}-[0-9]{2}|^[0-9]{2}-[A-Za-z]{3}-[0-9]{4}'
      negate: true
      match: after
However, an edge case arises when a stand-alone log entry (without a timestamp) follows an ndjson message, like so:
{"timestamp":"2023-01-30T15:34:25+08:00","message":"some message here"}
INFO: Server startup in 589 ms
This causes the multiline parser to append the second line to the first, because the second line doesn't match the "start-of-record" pattern, and the combined event can then no longer be parsed as JSON later on.
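To illustrate the edge case outside of filebeat, here is a rough Python sketch of how I understand negate: true / match: after to behave. It is only an approximation and not filebeat's actual implementation; the function name group and the variable names are made up for this example, and the Tomcat date branch assumes a two-digit day.

```python
import re

# Modelled on the multiline pattern: a line starts a new record if it begins
# with '{"', an ISO date (2023-01-30), or a Tomcat-style date (30-Jan-2023).
# Assumption: the Tomcat day is always two digits.
START = re.compile(
    r'^\{"|^[0-9]{4}-[0-9]{2}-[0-9]{2}|^[0-9]{2}-[A-Za-z]{3}-[0-9]{4}'
)

def group(lines):
    """Approximate negate: true / match: after.

    Lines that do NOT match START are appended to the previous event;
    lines that match START begin a new event.
    """
    events = []
    for line in lines:
        if START.search(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events

sample = [
    '{"timestamp":"2023-01-30T15:34:25+08:00","message":"some message here"}',
    'INFO: Server startup in 589 ms',
]

events = group(sample)
for event in events:
    print(repr(event))
```

With this sample the two lines come out as a single event, so the ndjson line ends up with the untimestamped line glued onto it, which is the behaviour I'm seeing.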
It seems like "flush_pattern" might be useful, but it would need to be a pattern that doesn't match, due to negation:
parsers:
  - multiline:
      type: pattern
      pattern: '^\{"|^[0-9]{4}-[0-9]{2}-[0-9]{2}|^[0-9]{2}-[A-Za-z]{3}-[0-9]{4}'
      negate: true
      match: after
      flush_pattern: '~(\})$'
I've tried several variations of the flush_pattern regular expression, but I can't find a way to negate the match so that it marks the end of an ndjson message.
Any suggestions?
- ES version: 8.3.3
- filebeat version: 8.3.3
- OS: CentOS 7