Problems Migrating to Multiline Codec from Multiline Filter

I am currently trying to migrate to the multiline codec from the multiline filter because the filter is not thread-safe (I had been experiencing problems with lost log lines due to this).

However, I've run into a problem because of our use case: previously we fed Logstash JSON input, used the json filter to parse that data into separate fields, and then routed the fields containing stack traces through a single multiline filter. Now that we are using the codec, even with an appropriate regex, it stacks our rows together BEFORE they have been parsed into their own fields. Then, when a document containing three JSON lines is fed into the json filter, the filter simply overwrites identical fields with the value from the last row containing that field.

So,

{"host": "host1", "data": "normal_log_row"}
and
{"host": "host1", "data": "stack_trace_line1"}
and
{"host": "host1", "data": "stack_trace_line2"}

fed in separately turn into:

{"host": "host1", "data": "normal_log_row"}
{"host": "host1", "data": "stack_trace_line1"}
{"host": "host1", "data": "stack_trace_line2"}

(single document) after the multiline codec, and then into:

{"host": "host1", "data": "stack_trace_line2"}

after the json filter, when what I want is:

{"host": "host1", "data": "normal_log_row"}
{"host": "host1", "data": "stack_trace_line1\nstack_trace_line2"}

I do not believe we can go back to the multiline filter after our issues with it, plus it has been deprecated. Is there a way to get the json filter to append fields with the same name, rather than replacing them?

We need multiline filter capabilities, but it is posing serious problems to only have it accessible as a codec, at the beginning of the filter chain.
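For context, here is a minimal sketch of the pipeline shape that causes the problem (the input plugin, port, and pattern are placeholders, not our actual config):

```
input {
  tcp {
    port => 5000
    # The multiline codec runs at input time, before any filter,
    # so it joins the raw JSON lines into a single event.
    codec => multiline {
      pattern => "stack_trace"   # placeholder pattern
      what => "previous"
    }
  }
}

filter {
  # By the time json runs, the message already contains several
  # concatenated JSON objects, and repeated keys overwrite each other.
  json {
    source => "message"
  }
}
```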

Thank you!

Do you have a correlation id to group the events?

I do not. Can you tell me more about what a correlation id is?

Any field in each event that would identify each ML event as belonging together.

I am going to suggest that you investigate the aggregate filter, but note that you need to set pipeline workers to 1.
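A rough sketch of the kind of thing I mean (the field names and timeout are placeholders; this assumes `host` can serve as the task id, and it requires `pipeline.workers: 1` so events are processed in order):

```
filter {
  json {
    source => "message"
  }
  aggregate {
    task_id => "%{host}"     # whatever field correlates the lines
    code => "
      map['data'] ||= []
      map['data'] << event.get('data')
    "
    push_map_as_event_on_timeout => true
    timeout => 5             # seconds of inactivity before flushing
    # Join the collected lines with newlines on flush (10.chr is \n)
    timeout_code => "event.set('data', event.get('data').join(10.chr))"
  }
}
```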

What field were you using for the stream_identity in the multiline filter?

How is it that you have JSON objects that each represent parts of a natural (to a human) event?

Simply the default value for stream identity.

Every JSON object was a separate event, fed into Logstash as a stream. Unfortunately these were delimited by ROW or log line, so each LINE of the stack trace came in as a separate event. Previously this was not a problem because I could then route the stack field alone through the multiline filter to correct the issue.
I will investigate this aggregate filter now though. Thanks!