Filebeat multiline java app log4j logs

You are looking for some kind of correlation/join operation. In order to pull this off you will need:

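  • a transaction 'ID' attached to every log event
  • parsing/grokking of the message to extract that ID
  • stable (hash-based) partitioning on the ID (as you have multiple instances)
  • a (time-windowed) join/correlation of the events sharing an ID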
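
To make the four pieces concrete, here is a minimal Python sketch of the idea, not of any Filebeat or Logstash internals. The log line format (`<epoch seconds> [txn=<id>] <message>`), the function names, and the use of CRC32 as the stable hash are all assumptions made for illustration.

```python
import re
import zlib

# Hypothetical line format: "<epoch seconds> [txn=<id>] <message>"
LINE_RE = re.compile(r"^(?P<ts>\d+(?:\.\d+)?) \[txn=(?P<txn>[^\]]+)\] (?P<msg>.*)$")


def parse(line):
    """Extract timestamp, transaction ID and message; return None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return {"ts": float(m.group("ts")), "txn": m.group("txn"), "msg": m.group("msg")}


def partition_for(txn_id, num_partitions):
    """Stable partitioning: the same ID always maps to the same partition.

    CRC32 is used because it gives the same result on every host and process,
    unlike Python's built-in hash() for strings, which is randomized per process.
    """
    return zlib.crc32(txn_id.encode("utf-8")) % num_partitions


def correlate(events, window_seconds):
    """Group events by transaction ID within a time window.

    An event joins the open group for its ID unless it arrives more than
    window_seconds after that group's first event; then a new group starts.
    """
    groups = []
    open_groups = {}
    for ev in sorted(events, key=lambda e: e["ts"]):
        group = open_groups.get(ev["txn"])
        if group is None or ev["ts"] - group[0]["ts"] > window_seconds:
            group = []
            groups.append(group)
            open_groups[ev["txn"]] = group
        group.append(ev)
    return groups
```

In a real pipeline the partitioning step would be done by the shipper or message broker and the correlation by each consumer on its own partitions; the sketch only shows why the ID, the stable hash, and the window are all needed.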

Reconsider whether you really need to pre-correlate the data in order to build your dashboards. Correlation can be 'expensive' and can bloat the processing architecture. Also see: Merging rails log lines

By the way, Filebeat supports parsing JSON. If you use log4j with a JSON layout and attach an ID to every event (e.g. via ThreadContext or a custom field), you can configure the Kafka output in Filebeat to hash/partition based on that ID. In this case all events for a given ID end up at the same Logstash instance, so you can correlate them via the aggregate filter. I haven't used log4j for years, though, so I have no idea how easy it is to build something like this.
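
As a rough illustration of what a single consumer could do once all JSON events for an ID arrive at the same place, here is a small Python sketch. It is not the Logstash aggregate filter; the field names `transaction_id`, `ts` and `message` are hypothetical and would depend on how the JSON layout is configured.

```python
import json


def aggregate_by_id(json_lines, id_field="transaction_id"):
    """Merge JSON log events that share an ID into one summary per ID.

    Assumes each line is a JSON object with a numeric "ts" and a "message"
    field (hypothetical names). Events without the ID field are skipped.
    """
    summaries = {}
    for line in json_lines:
        event = json.loads(line)
        txn = event.get(id_field)
        if txn is None:
            continue
        summary = summaries.setdefault(
            txn,
            {"id": txn, "count": 0, "first_ts": event["ts"],
             "last_ts": event["ts"], "messages": []},
        )
        summary["count"] += 1
        summary["first_ts"] = min(summary["first_ts"], event["ts"])
        summary["last_ts"] = max(summary["last_ts"], event["ts"])
        summary["messages"].append(event["message"])
    return summaries
```

Each summary carries enough (count, first and last timestamp, messages) to derive something like a per-transaction duration for a dashboard; a real setup would also need a timeout to flush transactions that never complete.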