Multiline Plugin - metadata missing from last line

Hi,

I am trying to combine multiple lines into one event using the multiline codec, and I also need the filename from the S3 metadata. The lines combine properly. However, the filename metadata is lost for the last event read from each file.

Here is my configuration:

input {
  s3 {
    bucket => "bucket_name"
    region => "us-east-2"
    codec => multiline {
      pattern => "^(%{DATESTAMP})"
      negate => "true"
      what => "previous"
    }
  }
}
filter {
  mutate { add_field => { "file_name" => "%{[@metadata][s3][key]}" } }
}
output {
  stdout { codec => rubydebug }
}

The sample input (sampleLog.txt):

06-19-2018 15:25:35.7046|ERROR
	more info...
06-19-2018 15:25:35.7046|DEBUG
	more info...
06-19-2018 15:25:35.7046|INFO
	more info...

And the logstash output:

{
    "@timestamp" => 2018-06-20T14:41:09.998Z,
       "message" => "06-19-2018 15:25:35.7046|ERROR\r\n\tmore info...\r",
          "tags" => [
        [0] "multiline"
    ],
      "@version" => "1",
     "file_name" => "sampleLog.txt"
}
{
    "@timestamp" => 2018-06-20T14:41:09.998Z,
       "message" => "06-19-2018 15:25:35.7046|DEBUG\r\n\tmore info...\r",
          "tags" => [
        [0] "multiline"
    ],
      "@version" => "1",
     "file_name" => "sampleLog.txt"
}
{
    "@timestamp" => 2018-06-20T14:41:09.999Z,
       "message" => "06-19-2018 15:25:35.7046|INFO\r\n\tmore info...\r",
          "tags" => [
        [0] "multiline"
    ],
      "@version" => "1",
     "file_name" => "%{[@metadata][s3][key]}"
}

I have noticed that the filename metadata is missing whether or not the final line is part of a multiline event. Also, if I remove the multiline codec from my configuration, the filename metadata appears for the final line (but then, obviously, multiline events are not combined).

Any help is much appreciated!

Update: It appears that my problem has something to do with the setting what => "previous". If I change "previous" to "next", the last line includes the metadata. However, that does not give the grouping I need for multiline events.

It looks like a bug to me. Looking at the source, the main loop that reads the file sets [@metadata][s3][key] on each event before it pushes the event onto the queue (line 212). But after the main loop completes, when it flushes the codec, it queues the flushed event without adding that key (line 220). With what => "previous" the codec holds the final event until that flush, which would explain why only the last event is affected.
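
Until that is fixed in the plugin, one possible stopgap is to guard the mutate so the last event does not end up with the literal "%{[@metadata][s3][key]}" string. This is only a sketch and does not restore the missing filename. The tag name "_s3_key_missing" is made up for illustration; use whatever name suits you.

```
filter {
  # Only copy the key when the s3 input actually set it on the event.
  if [@metadata][s3][key] {
    mutate { add_field => { "file_name" => "%{[@metadata][s3][key]}" } }
  } else {
    # Hypothetical tag name: marks events that came out of the codec flush
    # without the s3 key, so they can be found or routed later.
    mutate { add_tag => [ "_s3_key_missing" ] }
  }
}
```

With this in place the last event from each file should show up without a file_name field and with the extra tag, instead of carrying the unresolved placeholder.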

Thanks! I have opened a GitHub issue for this:
