I am currently trying to migrate from the multiline filter to the multiline codec, because the filter is not thread-safe (we had been losing log lines because of this).
However, I've run into a problem with our use case. Previously we fed Logstash JSON input, used the json filter to parse that data into separate fields, and then routed the fields containing stack traces through a single multiline filter. Now that we are using the codec, even with an appropriate regex, it joins our rows together BEFORE they have been parsed into their own fields. When a document containing three JSON lines is then fed into the json filter, the filter simply overwrites identical fields with the value from the last row that contains that field.
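For reference, here is roughly what the two setups look like (the file path and the continuation-line pattern below are simplified placeholders, not our exact config):

```
# Old setup -- multiline as a filter, running AFTER json parsing
filter {
  json { source => "message" }
  multiline {
    pattern => "^\s"       # placeholder: continuation lines start with whitespace
    what    => "previous"
  }
}

# New setup -- multiline as a codec on the input, so it runs BEFORE any filter
input {
  file {
    path  => "/var/log/app.log"   # placeholder path
    codec => multiline {
      pattern => "^\s"            # placeholder pattern
      what    => "previous"
    }
  }
}
filter {
  json { source => "message" }
}
```

In the new setup the codec joins raw lines before the json filter ever sees them, which is exactly where the trouble starts.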
So,
{"host": "host1", "data": "normal_log_row"}
and
{"host": "host1", "data": "stack_trace_line1"}
and
{"host": "host1", "data": "stack_trace_line2"}
fed in separately turn into:
{"host": "host1", "data": "normal_log_row"}
{"host": "host1", "data": "stack_trace_line1"}
{"host": "host1", "data": "stack_trace_line2"}
(a single document) after the multiline codec, and then into:
{"host": "host1", "data": "stack_trace_line2"}
after the json filter, when what I want is:
{"host": "host1", "data": "normal_log_row"}
{"host": "host1", "data": "stack_trace_line1\nstack_trace_line2"}
I do not believe we can go back to the multiline filter after our issues with it, especially given its deprecation. Is there a way to get the json filter to append values to fields with the same name, rather than replacing them?
We need the multiline filter's capabilities, but only having them available as a codec, at the very start of the pipeline before any filters run, is posing serious problems.
Thank you!