Problem converting multi-delimited kv string

I'm stuck converting one MySQL column to ES via Logstash. I can convert a single key-value pair to the expected format, but not multiple pairs.

The data is in one MySQL column called "profile". It goes to ES under the same field name, with mapping type = object.

  1. Keys are open-ended and may differ for each record, but the schema is the same
  2. "=" separates a key from its values
  3. Everything within [...] is the array of values for the preceding key
  4. A ";" outside the [...] marks the end of the value set for the preceding key
  5. A ";" inside the [...] marks the end of a single value

Sample input data in mysql column: profile

record1:
hobbies=[football;basketball;hockey;];interests=[reading;dancing;];food=[burger;pizza;ice-cream;];

record2:
medical=[thyroid;anemia;];food=[chicken;];

Expected result in ES:

record1:

"profile":{
  "hobbies":	["football","basketball","hockey"],
  "interests":	["reading","dancing"],
  "food":	["burger","pizza","ice-cream"]
}

record2:

"profile": {
  "medical":	["thyroid","anemia"],
  "food": 	["chicken"]
}
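
For reference, the format rules and the expected result above can be sketched in plain Ruby, outside Logstash. This is only an illustration of the parsing logic, not a tested Logstash configuration; parse_profile is a hypothetical helper name, and the regular expression assumes that keys and values never contain "=", "[" or "]".

```ruby
# Parse a profile string such as "hobbies=[a;b;];food=[c;];" into a Hash of Arrays.
# parse_profile is a hypothetical helper name, not part of Logstash.
def parse_profile(raw)
  result = {}
  # Each pair looks like key=[v1;v2;] so capture the key and the bracket contents.
  raw.to_s.scan(/([^=;\[\]]+)=\[([^\]]*)\]/) do |key, values|
    # String#split drops trailing empty strings, so the final ";" adds no empty element.
    result[key.strip] = values.split(";")
  end
  result
end

record1 = "hobbies=[football;basketball;hockey;];interests=[reading;dancing;];food=[burger;pizza;ice-cream;];"
p parse_profile(record1)
```

The same scan could presumably run inside a Logstash ruby filter, reading the string with event.get("profile") and writing the resulting hash back with event.set, but that is an assumption that has not been tried here.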

Does the include_brackets option not help with that?

Only partly. I separated the fields using:

  kv {
    source => "profile"
    include_brackets => false
    field_split_pattern => "\];"
    remove_char_value => "\[\]"
    target => "profile2"
  }

This outputs:

"profile2": {
  "field1" : "football;basketball;hockey;",
  "field2" : "reading;dancing;"
}

But now each field inside profile2 has to be converted to an array. The fields are open-ended per record, up to n fields, so I can't hardcode the names. I assume I'll need to loop over whatever fields exist inside profile2 and split each one on ";" to turn it into an array.

It should be:

"profile2": {
  "field1" : ["football","basketball","hockey"],
  "field2" : ["reading","dancing"]
}

Actually I was suggesting

include_brackets => true
field_split_pattern => ";"

That produces the same output:

"profile2": {
  "field1" : "football;basketball;hockey;",
  "field2" : "reading;dancing;"
}

But now each value has to become an array for these random fields.

You can use a ruby filter to do the splits. I haven't tested it, but something like

ruby {
    code => '
        event.get("profile2").each { |k, v|
            event.remove("[profile2][#{k}]") # Not sure if this is needed
            event.set("[profile2][#{k}]", v.split(";"))
        }
    '
}

The ruby filter works perfectly. The second line (the event.remove call) is not needed. Thanks a lot.
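
For anyone wanting to check the splitting logic outside Logstash, here is a minimal plain-Ruby sketch of the same per-field split. split_values is a hypothetical helper name. The guards for a missing hash and for non-string values are additions that were not part of the filter above; they assume some records may have an empty profile or values that are already arrays.

```ruby
# split_values is a hypothetical helper: it turns every String value of a Hash into an Array.
def split_values(fields, separator = ";")
  # A record without the field would give nil here; return an empty Hash in that case.
  return {} unless fields.is_a?(Hash)

  fields.each_with_object({}) do |(key, value), out|
    # Leave non-string values (numbers, arrays, nil) untouched.
    # String#split drops trailing empty strings, so "a;b;" becomes ["a", "b"].
    out[key] = value.is_a?(String) ? value.split(separator) : value
  end
end

profile2 = { "field1" => "football;basketball;hockey;", "field2" => "reading;dancing;" }
p split_values(profile2)
```

Inside the ruby filter, the equivalent guard would be to check that event.get("profile2") returns a Hash before calling each on it.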
