Bug: KV filter does not handle quoted strings with spaces at the end of a line when using Windows line endings

I've come across an issue with the kv filter plugin: if the input file has Windows line endings and the last key-value pair on a line is a quoted string containing spaces, the string is not read correctly. It is read only up to the first space, including the leading quotation mark. This only happens on lines that are terminated by a line ending, so if there is no newline after the last line, that line is read correctly.

  • Version: Logstash 7.5.0, KV-filter 4.4.0
  • Operating System: Debian 10
  • Config File:

input {
  file {
    path => "[path to input file]"
    mode => "read"
    start_position => "beginning"
  }
}
filter {
  kv {
    target => "log_params"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}

  • Sample Data:

key1="string one one" key2=123 key3="string one two"
key1="string one two" key2=321 key3="string two two"

  • Steps to Reproduce:
    Create two files with the above sample data, one with Unix line endings and one with Windows line endings. The Unix line endings file returns (only printing output of the first line to keep it short):

log_params.key1: string one one
log_params.key2: 123
log_params.key3: string one two

The Windows line endings file returns:

log_params.key1: string one one
log_params.key2: 123
log_params.key3: "string

edit1: typo

Did you try telling the kv filter to use = as the key-value separator?

I don't think that is a bug. Quoted fields have to start and end with quotes. With Windows line endings the last value on the line ends with a carriage return rather than a quote, so instead of taking "string one two" as a quoted field the filter consumes just the first word. The remaining words have no value separator, so they are dropped.

You can use

field_split => "^M "

(where ^M is a literal carriage return) to cause it to be a quoted string again.

Thanks for the clarification. I tried adding the field split, which worked on the test data. On my real world data it worked better than before, meaning a lot of the lines that were wrong before are now right. However, for some reason it didn't work on all lines. I'm not sure what other weird characters might have sneaked in on those lines, but in the end I just added a mutate filter and replaced every carriage return with nothing. This worked on every line.
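
For anyone landing here later, the mutate workaround described above might look roughly like the sketch below. It assumes the raw line is in the default message field and that mutate runs before kv; neither detail is stated in the post. The gsub pattern is treated as a regular expression, so \r should match a carriage return.

```
filter {
  # Assumption: the raw line is in the default "message" field.
  # Strip every carriage return before the kv filter parses the line.
  mutate {
    gsub => ["message", "\r", ""]
  }
  kv {
    target => "log_params"
  }
}
```

Stripping the carriage returns first means the kv filter sees the closing quote as the last character of the line, so the default field_split can stay as it is.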
