Problems Migrating to Multiline Codec from Multiline Filter

I am currently trying to migrate to the multiline codec from the multiline filter because the filter is not thread-safe (I had been experiencing problems with lost log lines due to this).

However, I've run into a problem because of our use case: previously we fed Logstash JSON input, used the json filter to parse that data into separate fields, and then routed the fields containing stack traces through a single multiline filter. Now that we are using the codec, even with an appropriate regex, it stacks our rows together BEFORE they have been parsed into their own fields. Then, when a document containing three JSON lines is fed into the json filter, the filter simply overwrites identical fields with the value from the last row containing that field.

So,

{"host": "host1", "data": "normal_log_row"}
and
{"host": "host1", "data": "stack_trace_line1"}
and
{"host": "host1", "data": "stack_trace_line2"}

fed in separately turn into:

{"host": "host1", "data": "normal_log_row"}
{"host": "host1", "data": "stack_trace_line1"}
{"host": "host1", "data": "stack_trace_line2"}

(single document) after the multiline codec, and then into:

{"host": "host1", "data": "stack_trace_line2"}

after the json filter, when what I want is:

{"host": "host1", "data": "normal_log_row"}
{"host": "host1", "data": "stack_trace_line1\nstack_trace_line2"}

I do not believe we can go back to the multiline filter after our issues with it, plus it has been deprecated. Is there a way to get the json filter to append fields with the same name, rather than replacing them?

We need multiline filter capabilities, but it is posing serious problems to only have it accessible as a codec, at the beginning of the filter chain.
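For context, here is a minimal sketch of the pipeline shape that causes the problem (the input plugin, port, and pattern are placeholders, not our actual config):

```
input {
  tcp {
    port => 5000
    # The multiline codec runs at input time, before any filter,
    # so it joins the raw JSON lines into a single event.
    codec => multiline {
      pattern => "stack_trace"   # placeholder pattern
      what => "previous"
    }
  }
}

filter {
  # By the time json runs, the message already contains several
  # concatenated JSON objects, and repeated keys overwrite each other.
  json {
    source => "message"
  }
}
```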

Thank you!

Do you have a correlation id to group the events?

I do not. Can you tell me more about what a correlation id is?

Any field in each event that would identify each ML event as belonging together.

I am going to suggest that you investigate the aggregate filter, but note that you need to set pipeline workers to 1.
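A rough sketch of the kind of thing I mean (the field names and timeout are placeholders; this assumes `host` can serve as the task id, and it requires `pipeline.workers: 1` so events are processed in order):

```
filter {
  json {
    source => "message"
  }
  aggregate {
    task_id => "%{host}"     # whatever field correlates the lines
    code => "
      map['data'] ||= []
      map['data'] << event.get('data')
    "
    push_map_as_event_on_timeout => true
    timeout => 5             # seconds of inactivity before flushing
    # Join the collected lines with newlines on flush (10.chr is \n)
    timeout_code => "event.set('data', event.get('data').join(10.chr))"
  }
}
```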

What field were you using for the stream_identity in the multiline filter?

How is it that you have JSON objects that each represent parts of a natural (to a human) event?

Simply the default value for stream identity.

Every JSON object was a separate event, fed into Logstash as a stream. Unfortunately these were delimited by ROW or log line, so each LINE of the stack trace came in as a separate event. Previously this was not a problem because I could then route the stack field alone through the multiline filter to correct the issue.
I will investigate this aggregate filter now though. Thanks!