Problem converting multi-delimited kv string

I'm stuck converting one MySQL column to ES via Logstash. I can convert a single key-value pair to the expected format, but not multiple pairs.

The data is in one MySQL column called "profile". It goes to ES under the same field name, with mapping type object.

  1. Keys are open-ended and may differ for each record, but the format is the same
  2. "=" separates a key from its values
  3. Everything within [...] is the array of values for the preceding key
  4. A ";" outside the [...] marks the end of the value set for the preceding key
  5. A ";" inside the [...] marks the end of a single value
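The five rules above can be sketched in plain Ruby, outside Logstash. This is only an illustration: `parse_profile` is a hypothetical helper, and the sample string is an assumed layout inferred from the rules, not the actual column data.

```ruby
# Hypothetical helper: turn "key=[v1;v2;];key2=[v3;];" into a hash of arrays.
# The input layout is an assumption inferred from the five rules above.
def parse_profile(raw)
  result = {}
  # Each match is one key followed by its bracketed value set.
  raw.scan(/([^=;\[\]]+)=\[([^\]]*)\]/) do |key, values|
    # ";" inside the brackets ends one value; drop any empty pieces.
    result[key.strip] = values.split(";").map(&:strip).reject(&:empty?)
  end
  result
end

# Assumed sample input, made up for illustration.
sample = "hobbies=[football;basketball;hockey;];interests=[reading;dancing;];"
p parse_profile(sample)
```

Because the keys are matched by pattern rather than by name, the same code handles records with different keys.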

Sample input data in mysql column: profile



Expected result in ES:


"profile": {
  "hobbies": ["football","basketball","hockey"],
  "interests": ["reading","dancing"],
  "food": ["burger","pizza","ice-cream"]
}

"profile": {
  "medical": ["thyroid","anemia"],
  "food": ["chicken"]
}

Does the include_brackets option not help with that?

Only partly. I separated the fields using:

  kv {
    source => "profile"
    include_brackets => false
    field_split_pattern => "\];"
    remove_char_value => "\[\]"
    target => "profile2"
  }

This outputs:

   "profile2": {
    "field1" : "football;basketball;hockey;",
    "field2" : "reading;dancing;"
   }

But now each field inside profile2 has to be converted to an array. The fields are open-ended per record, up to n fields, so I can't hardcode the names. I assume I'll need to loop over whatever fields exist inside profile2 and split each one on ";" to make it an array.

It should be:

   "profile2": {
    "field1" : ["football","basketball","hockey"],
    "field2" : ["reading","dancing"]
   }

Actually I was suggesting

include_brackets => true
field_split_pattern => ";"

That produces the same output:

   "profile2": {
    "field1" : "football;basketball;hockey;",
    "field2" : "reading;dancing;"
   }

But now each value has to become an array for these random fields.

You can use a ruby filter to do the splits. I haven't tested it, but something like

ruby {
    code => '
        # Walk every key under profile2, whatever its name, and split its value on ";"
        event.get("profile2").each { |k, v|
            event.remove("[profile2][#{k}]") # Not sure if this is needed
            event.set("[profile2][#{k}]", v.split(";"))
        }
    '
}

The ruby filter works perfectly. The second line (the event.remove call) is not needed. Thanks a lot.
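For anyone who wants to check the split behaviour outside Logstash, here is a minimal plain-Ruby sketch. The hash stands in for what `event.get("profile2")` returns, and `split_values` is a hypothetical helper name, not part of the Logstash event API.

```ruby
# Stand-in for event.get("profile2"); the field names come from the thread.
profile2 = {
  "field1" => "football;basketball;hockey;",
  "field2" => "reading;dancing;"
}

# Hypothetical helper: split every string value on ";" without knowing
# the key names in advance. Non-string values are left untouched.
def split_values(fields)
  fields.transform_values { |v| v.is_a?(String) ? v.split(";") : v }
end

p split_values(profile2)
```

The trailing ";" does not produce an empty last element, because Ruby's String#split drops trailing empty strings by default.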
