Kv filter: empty value after key yields strange behavior

Logstash version 6.3.0

I'm using the kv filter with the default settings, which means:

value_split => "="
field_split => " "

If I send a message containing key= with no value after the equals sign, the filter treats the following key-value pair as the value for that key. That is, key= key2=value is parsed the same as key="key2=value". You can reproduce this behavior by running Logstash with

bin/logstash -e "input{ stdin{} } filter{ kv{} } output{ stdout{} }"

and then sending the following inputs through stdin:

All keys w/ values (pass)

input: key1=value1 key2=value2 key3=value3
output:

{
      "@version" => "1",
          "key1" => "value1",
       "message" => "key1=value1 key2=value2 key3=value3\r",
          "host" => "logstash",
    "@timestamp" => 2018-08-03T19:16:11.129Z,
          "key2" => "value2",
          "key3" => "value3\r"
}

Second key missing value (fail)

input: key1=value1 key2= key3=value3
output:

{
      "@version" => "1",
          "key1" => "value1",
       "message" => "key1=value1 key2= key3=value3\r",
          "host" => "logstash",
    "@timestamp" => 2018-08-03T19:16:20.520Z,
          "key2" => "key3=value3\r"
}

First key missing value (fail)

input: key1= key2=value2 key3=value3
output:

{
      "@version" => "1",
          "key1" => "key2=value2",
       "message" => "key1= key2=value2 key3=value3\r",
          "host" => "logstash",
    "@timestamp" => 2018-08-03T19:16:34.751Z,
          "key3" => "value3\r"
}

First key with empty quotes (pass)

input: key1="" key2=value2 key3=value3
output:

{
      "@version" => "1",
       "message" => "key1=\"\" key2=value2 key3=value3\r",
          "host" => "logstash",
    "@timestamp" => 2018-08-03T19:16:49.252Z,
          "key2" => "value2",
          "key3" => "value3\r"
}

Last key missing value (pass)

input: key1=value1 key2=value2 key3=
output:

{
      "@version" => "1",
          "key1" => "value1",
       "message" => "key1=value1 key2=value2 key3=\r",
          "host" => "logstash",
    "@timestamp" => 2018-08-03T19:17:34.828Z,
          "key2" => "value2"
}

Is there a way to tell the kv filter that an empty space after the = means the key has a null value?

Thanks

You cannot have empty values, but you could rewrite the message before it reaches kv with something like

mutate { gsub => [ "message", "= ", "='' " ] }
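
Placed in a pipeline, that workaround might look like the sketch below. The one assumption here is ordering: the mutate has to run before kv so that kv sees the rewritten message.

```logstash
filter {
  # Rewrite every "= " (equals sign followed by a space) to "='' " so that
  # each bare key is followed by an explicit, quoted empty value
  mutate { gsub => [ "message", "= ", "='' " ] }

  # kv then parses the rewritten message with its default settings
  kv { }
}
```

Keep in mind that this rewrites the message field itself, and that it would also touch any value that happens to contain "= ".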

The KV filter plugin now has a whitespace => "strict" mode, which was recently added to address this issue.

The original parser was too lenient with whitespace, and was unable to differentiate an empty unquoted value from an optional whitespace that occurred before a value.

You may need to update the plugin.
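
As a rough sketch, enabling that mode might look like the following. This assumes your installed kv filter plugin is recent enough to include the whitespace option; the comments reflect the behavior described above, not output I have verified.

```logstash
filter {
  kv {
    # Defaults, written out for clarity
    field_split => " "
    value_split => "="

    # "strict": a field separator directly after "=" is taken to mean an
    # empty value, rather than being skipped as optional whitespace
    whitespace => "strict"
  }
}
```

With this in place, an input such as key1= key2=value2 should no longer fold key2=value2 into key1.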

Thank you, that worked! I didn't think about replacing with quotes.

I have another problem (I'm trying to capture logs from a firewall): sometimes the messages I receive do not wrap values in quotes even when those values contain spaces, and the space is also the field separator. The keys, however, are always single words.

An example would be a message that looks like this:

key1=value key2=complex multiword value key3=anothervalue

Does gsub allow you to replace capture groups? If so, I could make a regular expression that captures spaced-values and replaces the value with quotes surrounding it.

Yes, it supports capture groups.

mutate { gsub => [ "message", "(foo)bar(baz)", "\1 or \2" ] }

would turn "foobarbaz" into "foo or baz"

Or, if your keys are predictable enough, you could use the KV Filter Plugin's field_split_pattern directive with a regular expression lookahead to split fields only on spaces that come before keys.

I've crafted a pattern for you with tests here -> http://rubular.com/r/08Ol37nQeG

For example, if my keys always were alphanumeric and matched [A-Za-z0-9]+, I would define the pattern with a lookahead like so:

    field_split_pattern => " (?=[A-Za-z0-9]+=)"

I'll break that expression up and explain what is going on:

Split on any:

  • " ": a space
  • (?=: that is followed by (start lookahead):
    • [A-Za-z0-9]+: a sequence of 1 or more alphanumeric characters, then
    • =: an equals sign
  • ): end lookahead
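
Put together, a minimal pipeline using that pattern might look like this. It is a sketch that assumes keys really do match [A-Za-z0-9]+ and that the installed kv plugin supports field_split_pattern; the stdin and stdout plugins are only there to make it easy to try by hand.

```logstash
input { stdin { } }

filter {
  kv {
    # Split fields only on a space that is followed by "<key>=", so that
    # spaces inside an unquoted value stay part of that value
    field_split_pattern => " (?=[A-Za-z0-9]+=)"
  }
}

output { stdout { } }
```

The intent is that a line such as key1=value key2=complex multiword value key3=anothervalue is split into three fields, with the multiword value kept whole under key2.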

Thanks! This seems like a better solution than the road I was going down. I was attempting to capture the whole message with gsub and do a replace with this regular expression to put quotes around the values -> http://rubular.com/r/HsNA8E2kDR

regex > (\w+)=([^=]*)(?=\s)
substitution > \1="\2"

But I would have had to force a space at the end of the message.
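
For what it's worth, one way to avoid forcing the trailing space might be to let the lookahead also accept the end of the string. The \z alternative below is an addition for illustration, not something tested in this thread, and the single-quoted replacement string is just a way to avoid escaping the inner double quotes:

```logstash
filter {
  mutate {
    # (?=\s|\z): the value must be followed by whitespace OR the end of
    # the string, so the last key=value pair is quoted as well
    gsub => [ "message", "(\w+)=([^=]*)(?=\s|\z)", '\1="\2"' ]
  }

  # kv then sees quoted values and can keep the embedded spaces
  kv { }
}
```

Like the original expression, this assumes values never contain an equals sign.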
