Filebeat multiline configuration for Java app log4j logs

Need some help with the Filebeat multiline configuration for Java app logs. I'm pasting sample logs from my app below.

For each request I'll have a Request line and a Response line, sometimes with a few extra lines as well. All of this logging is done by log4j. The lines for a specific request are not always consecutive; logging lines from other threads can be printed in between.

09-27-2017 13:40:11,760 http-/10.98.88.26:8543-29 INFO com.xxx.xxxx - The username authenticate soap method of Token Authenticate Web Service was invoked Request1
09-27-2017 13:40:11,806 http-/10.98.88.26:8543-29 ERROR com.xxx.xxxx - Ping responded with Http Status code of [400] while performing username login...
09-27-2017 13:40:11,806 http-/10.98.88.26:8543-34 INFO com.xxx.xxxx - The username authenticate soap method of Token Authenticate Web Service was invoked Request2
09-27-2017 13:40:11,807 http-/10.98.88.26:8543-29 INFO com.xxx.xxx - The Token Authenticate Web Service Response Response1
09-27-2017 13:40:11,810 http-/10.98.88.26:8543-34 ERROR com.xxx.xxxx - Ping responded with Http Status code of [400] while performing username login...
09-27-2017 13:40:11,830 http-/10.98.88.26:8543-34 INFO com.xxx.xxx - The Token Authenticate Web Service Response Response2

Anyone?

The multiline feature only works when the lines that need to be grouped are consecutive. I think the Logstash aggregate filter may be applicable here.
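
Since the lines for one call share a thread name in your sample, here is a rough sketch of that idea, assuming the thread name is a good-enough correlation key and that the "was invoked" / "Web Service Response" phrases reliably mark the start and end of a call (all field names here are made up):

    filter {
      # Pull timestamp, thread, level, logger and message text out of each line
      grok {
        match => { "message" => "%{DATE_US:logdate} %{TIME:logtime} %{NOTSPACE:thread} %{LOGLEVEL:level} %{NOTSPACE:logger} - %{GREEDYDATA:logmsg}" }
      }

      # Start of a call: remember the request, keyed by thread name
      if [logmsg] =~ /was invoked/ {
        aggregate {
          task_id    => "%{thread}"
          code       => "map['request'] = event.get('logmsg')"
          map_action => "create"
          timeout    => 120   # give up on requests that never get a response
        }
      }

      # End of a call: copy the remembered request onto the response event.
      # end_of_task clears the map, so the thread name can safely be reused
      # by the next request on that thread.
      if [logmsg] =~ /Web Service Response/ {
        aggregate {
          task_id     => "%{thread}"
          code        => "event.set('request', map['request'])"
          map_action  => "update"
          end_of_task => true
        }
      }
    }

Note the aggregate filter only behaves correctly with a single pipeline worker (-w 1), and with multiple Logstash instances all lines for a given key must reach the same instance.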

Thank you. I kind of know what the aggregate filter does.

My current architecture is Filebeat (on multiple hosts) --> Kafka/Redis (cluster) --> Logstash (multiple instances) --> Elasticsearch.

Any idea how this can be implemented?

You are looking for some kind of correlation/join operation. In order to pull this off you will need:

  • a transaction 'ID'
  • parsing/grokking of the message
  • stable (hash-based) partitioning (as you have multiple instances)
  • a (time-windowed) join/correlation (see the sketch below)
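
For the last point, if there is no clean "end" event to close a transaction, the aggregate filter can also do a purely time-windowed join: collect everything seen for an ID and flush it as one combined event after a quiet period. A minimal sketch, assuming a txid field has already been parsed out of each line (the field name is hypothetical):

    filter {
      aggregate {
        task_id => "%{txid}"
        # Append every line for this transaction ID to a shared map
        code => "map['lines'] ||= []; map['lines'] << event.get('message')"
        push_map_as_event_on_timeout => true
        timeout => 30                     # flush the joined event after 30s of silence
        timeout_task_id_field => "txid"   # keep the ID on the flushed event
        timeout_tags => ["aggregated"]
      }
    }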

Reconsider whether you really need to pre-correlate the data in order to build your dashboards. Correlation can be 'expensive' and potentially bloats the processing architecture. Also see: Merging rails log lines

Btw, Filebeat supports parsing JSON. Using log4j with a JSON layout and attaching an ID (e.g. via ThreadContext or a custom field), one can configure the Kafka output in Filebeat to hash/partition based on that ID. In this case each Logstash instance will receive all events for a given ID, so you can correlate via the aggregate filter. That said, I haven't used log4j for years, so I have no idea how easy it is to build something like this.
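
A minimal sketch of the Filebeat side (5.x-style prospector config), assuming the app writes one JSON object per line that already contains a txid field, e.g. set via log4j's ThreadContext in a JSON layout; paths, broker addresses, topic and field names are all placeholders:

    filebeat.prospectors:
      - input_type: log
        paths:
          - /var/log/myapp/app.json     # placeholder path
        json.keys_under_root: true      # decode each line as JSON, lift fields to top level
        json.add_error_key: true        # flag lines that fail to decode

    output.kafka:
      hosts: ["kafka1:9092", "kafka2:9092"]   # placeholder brokers
      topic: "app-logs"
      partition.hash:
        # Hash on the transaction ID so all events for one ID land on the
        # same partition, and therefore on the same Logstash consumer.
        hash: ["txid"]
        random: true   # fall back to a random partition if txid is missing

Each Logstash instance then sees complete transactions, and an aggregate filter like the ones sketched above can do the join.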
