Filebeat: multiline: introduce merge by using max_lines as condition instead of pattern

Once in a while, people want to merge multiple lines into a single message based not on a pattern but on the number of lines to be merged. This may be because there is no clear, usable pattern, or simply to reduce the number of messages by combining several lines into one. In some situations it may also be handy to combine the lines into a JSON array that other applications can consume.

I propose introducing an extra multiline parameter, kind, that selects this behavior. All the other parameters remain valid, so in theory you could combine the pattern and max_lines parameters, although in practice I do not expect that combination to be common.

The values of the kind parameter would be <<empty>> (the default, i.e. the current implementation), merge, and merge-json, where merge-json combines the messages into a JSON array.
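A minimal sketch of what the proposed configuration might look like (the kind parameter and its values are part of this proposal and do not exist in Filebeat today; the input path is just an illustration):

```yaml
filebeat.inputs:
- type: log
  paths:
    - /var/log/app/dump.log
  multiline.kind: "merge"      # proposed: "", "merge", or "merge-json"
  multiline.pattern: ".*"      # match every line
  multiline.negate: false
  multiline.match: "before"
  multiline.max_lines: 5       # merge every 5 lines into one event
```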

Hi @williamd67, welcome to the Elastic community forums!

This is an interesting suggestion. I'd like to make sure I understand the problem you're trying to solve first. Let's take the example from the multiline documentation:

[beat-logstash-some-name-832-2015.11.28] IndexNotFoundException[no such index]
    at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(
    at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(
    at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(
    at org.elasticsearch.action.admin.indices.delete.TransportDeleteIndexAction.checkBlock(
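For comparison, the pattern-based approach from that documentation merges this stack trace by appending every line that does not start with `[` to the preceding line; the relevant settings look roughly like this:

```yaml
multiline.pattern: '^\['
multiline.negate: true
multiline.match: after
```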

My understanding of your use case is that you might want to take only, say, the first 3 lines of the above example and merge them into a single event.

So in that case you'd want a pattern to know where each multiline message starts, and then the maximum number of lines you want to merge. But that can already be accomplished with the pattern, match, and max_lines settings, right? Do we need to introduce the kind: merge option? That seems to be the current and default implementation anyway, right?

Hi @shaunak,

Your case can indeed be handled with the current multiline configuration. My case is a little different: it is when you know the number of lines in an event but there is no clear pattern.

For example, suppose someone has dumped a database table with one field per line. In that case you know the number of lines per row (equal to the number of columns), but creating a pattern for it may be hard. In this situation the configuration could be as follows:

multiline.kind: "merge"
multiline.pattern: ".*"
multiline.match: "before"
multiline.negate: false
multiline.max_lines: 13

where 13 is the number of columns in a row. This creates a single event per row. If you chose merge-json instead, the lines would be combined into one JSON array.
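To illustrate (this output format is my reading of the proposal, not implemented behavior), a dumped row of three fields such as:

```
42
Alice
alice@example.com
```

would become one event whose message is the JSON array:

```json
["42", "Alice", "alice@example.com"]
```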

Another use case is when someone just wants to group a set of similar events. For example, an application is creating a lot of events and you want to put them in buckets of 300 each, so that you can handle such a group as a single event. In that case the configuration could be as follows:

multiline.kind: "merge"
multiline.pattern: ".*"
multiline.match: "before"
multiline.negate: false
multiline.max_lines: 300

A side effect of the merge and merge-json options is that no lines are discarded.

I hope this clarifies the use-cases.

Ah, thanks so much for clarifying, @williamd67. I think this is definitely worth considering as an enhancement!

If you have a GitHub account, would you mind creating an enhancement issue for this? That way you can be notified of progress and be involved in any conversation with the team. If you prefer, I can do it on your behalf as well.



Hi, I created an enhancement request for this: #18038. Thanks for your support.

BTW: I will be able to create a PR for this as well.
