Kv Filter - allow_duplicate_values different behavior

Hi guys,

Thanks for taking the time to read these lines,
I've got a question regarding the kv filter and its allow_duplicate_values option.
Assuming I have a log that looks like :

key1=value1|key2=value2|key3=value3|key1=value1|key2=value2|key3=value3

At first, I thought I always had keyX=valueX (which is the case most of the time).
So by using :

field_split => "|"
value_split => "="
allow_duplicate_values => false

I got the below :

{
  key1 : "value1",
  key2 : "value2",
  key3 : "value3"
}

That was fine until I realized that the 2nd key1 can sometimes carry a different value, e.g. value4.
I'd like to understand why in this case my output looks like :

 {
  key1 : "value1,value4",
  key2 : "value2",
  key3 : "value3"
}

1/ How come ? I thought allow_duplicate_values=false would have kept either value1 or value4 but not both
2/ Actually, I'm happy with this 2nd output, but is there a way to change the separator from "," to ":" ?

Thank you
Guillaume

  1. No. If you have

    from=Badger from=Badger

    then with allow_duplicate_values => true you will get

     from: [ "Badger", "Badger" ]

    and with allow_duplicate_values => false you will get

     from: "Badger"

    If the values are different you always get both.

  2. key1 should be an array. Are you converting it to a string? If you want to change the separator in the string then use mutate+gsub.
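
To reproduce the behaviour described in point 1, a minimal pipeline along these lines should do. This is only a sketch: the generator input and the rubydebug output are there for illustration and are not part of the original configuration, and the sample line is the one from the question with the second key1 changed to value4.

```
input {
  # Emit the sample line once so the result can be inspected on stdout.
  generator {
    lines => [ "key1=value1|key2=value2|key3=value3|key1=value4|key2=value2|key3=value3" ]
    count => 1
  }
}
filter {
  kv {
    field_split => "|"
    value_split => "="
    # Drops repeated identical key/value pairs only.
    allow_duplicate_values => false
  }
}
output { stdout { codec => rubydebug } }
```

With this, key2 and key3 should each stay a single string because their repeated pairs are identical, while key1 should come out as an array holding value1 and value4.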

Hi @Badger
Thanks for the reply,

1/ Sorry, what do you mean by "you always get both"? In which format would I get both values ? from: "Badger,Badger" ?

2/ Downstream in the same logstash conf I'm doing :

add_field => { "a_new_key" => "%{key2}:%{key1}" }

So it was fine until I started seeing a few values like a_new_key: "value2:value1,value4" (going back to the example from my post).
Do you think the conversion to string would have added the comma ?

Ideally, I'm looking for a way to get a_new_key: "value2:value1" when key1=value1 appears twice in my logs, and a_new_key: "value2:value1:value4" when I have key1=value1|...|key1=value4

Thanks
Guillaume

When I say you always get both, I mean the case where the two values are different, like

from=Badger from=GitsBdr

Yes, doing the add_field with a sprintf reference to an array would do a to_s on it, which would join the members using a comma as the separator.

mutate { gsub => [ "message", ",", ":" ] }
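
Applied to the configuration from the question, that could look roughly like the sketch below. It assumes the comma only shows up in a_new_key because of the array join, so the gsub is pointed at that field rather than at message; if key1 or key2 values can themselves contain commas, those would be replaced too.

```
filter {
  # Build the combined field; an array in key1 is rendered comma-joined.
  mutate { add_field => { "a_new_key" => "%{key2}:%{key1}" } }

  # Swap the commas for colons in the new field only.
  mutate { gsub => [ "a_new_key", ",", ":" ] }
}
```

An alternative worth trying is mutate's join option on key1 (join => { "key1" => ":" }) in a mutate placed before the add_field. It leaves non-array fields alone, but it does turn key1 itself into a string.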

Morning,

Lovely ! gsub does the trick perfectly thanks !

If I can bother you with one more question, how would Elasticsearch react to from=Badger from=GitsBdr ? Would it first create the document with a field from and the value Badger, and then update it with the value GitsBdr ?
And Logstash ? I don't quite get how you would separately manipulate these 2 values if they have the same key name !

thanks
Guillaume

In Logstash, kv produces an array by default if there are multiple occurrences of a key. If given an array, Elasticsearch will maintain it as an array.
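
As for handling the two values separately on the Logstash side, one option is to address the array members by index with field references. A rough sketch, assuming the field is called from as in the example above; first_from and second_from are made-up names for illustration, and I have not checked how the conditional behaves when from is a plain string rather than an array.

```
filter {
  # Default kv settings split "from=Badger from=GitsBdr" on spaces and "=".
  kv { }

  # Only act when there is a second element in the array.
  if [from][1] {
    mutate {
      add_field => {
        "first_from"  => "%{[from][0]}"
        "second_from" => "%{[from][1]}"
      }
    }
  }
}
```

If each value should become its own event instead, the split filter on the from field is another route to look at.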

Alright, makes sense. Thanks a lot :)
