Flatten JSON and create dynamic fields

I know Ruby will be involved. I have tried a few code snippets that I found on the forum, but they didn't work out.

Here is my input:

message => '{"jobid":348332180, "runtime":{"cputime": 24, "totalelapsed": 105},"dataflow":[{"gate": "v204.06", "attrmath":"v206.07","keep": "lib"}]}'

dataflow will have different names and different values for each message, up to 150 different names,
like gate, attrmath, keep, in, out, etc.

I would like to create a document something like this, which means dynamically creating a field for whatever appears in the dataflow array.

{
    jobid: 348332180
    runtime_cputime: 24
    runtime_totalelapsed: 105
    gate_name: gate
    gate_version: v204.06
    attrmath_name: attrmath
    attrmath_version: v206.07
    keep_name: keep
    keep_version: lib
}

In the end I might have 200 fields in my index, but I think that is better than a nested structure.

Hey,

Will the json plugin do the job?

I'm not sure I understand the output you want, but you should try something like:

filter{ json { source => "message" } }

And the fields will be accessible through [jobid] or [runtime][cputime].

If you want to extract the fields and add their content without nesting, just rename them.
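
For reference, the nesting described here can be reproduced in plain Ruby with the standard json library. This is only a sketch outside Logstash: the variable names are assumptions, and the message is the sample from the question.

```ruby
require "json"

# Sample message from the question.
message = '{"jobid":348332180, "runtime":{"cputime": 24, "totalelapsed": 105},"dataflow":[{"gate": "v204.06", "attrmath":"v206.07","keep": "lib"}]}'

# JSON.parse returns nested Hashes and Arrays, comparable to the
# nested fields that appear on the event after parsing.
doc = JSON.parse(message)

jobid   = doc["jobid"]               # comparable to [jobid]
cputime = doc["runtime"]["cputime"]  # comparable to [runtime][cputime]
first   = doc["dataflow"][0]         # dataflow is an array holding one hash

puts jobid
puts cputime
puts first.keys.inspect
```

Note that dataflow is an array whose first element holds the key/value pairs, which matters for any code that iterates over it.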

All of that has been done, but the output looks like this:

{
      "dataflow" => [
        [0] {
                "keep" => "lib",
                "gate" => "v204.06",
            "attrmath" => "v206.07"
        }
    ],
      "@version" => "1",
    "@timestamp" => 2020-07-08T14:25:58.605Z,
                "jobid" => 348332180,
       "runtime" => {
             "cputime" => 24,
        "totalelapsed" => 105
    }
}

But I want this dataflow array to be flattened out.

I used this code, but it didn't work:

if [dataflow] {
   ruby {
     code => '
         kv = event.get("dataflow")
          kv.to_hash.each { |k,v|
         event.set(k, v)
      }
   '
   }
}

Have you tried the kv filter from Logstash?

You can play with the fields in Logstash:

filter {
  json {
    source => "message"
    target => "json"
  }
  # changing nested fields into top level fields
  mutate {
    rename => {
      "[json][jobid]" => "jobid"
      "[json][runtime]" => "runtime"
    }
  }
  # Extracting double nested field for kv
  kv {
    source => "[json][dataflow]"
  }
  mutate { remove_field => ["json"] }
}

This should be pretty OK, I assume?

Can I ask why it is a problem to keep nested fields?

You'll have to find a way to dynamically rename unknown fields?

This index can grow to many millions of records, and I don't want nested fields because they will cause more problems than not.

I hope someone can give me some pointers on Ruby code for this.

With "flatten out" you mean transfer those fields to the root level of your event, right? Then your code is close to the solution. But dataflow is an array with the actual data in the first entry. And you need to remove the field afterwards. I think it should be like this:

if [dataflow] {
   ruby {
     code => '
         event.get("[dataflow][0]").each { |k,v|
           event.set(k, v)
         }
         event.remove("dataflow")
   '
   }
}
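
The difference between the two attempts can be checked in plain Ruby. This is a sketch that assumes event.get("dataflow") hands back an ordinary Ruby Array of Hashes; the dataflow variable below is a stand-in for that return value, not a Logstash call.

```ruby
# Stand-in for what event.get("dataflow") is assumed to return.
dataflow = [{ "gate" => "v204.06", "attrmath" => "v206.07", "keep" => "lib" }]

# A plain Ruby Array has no to_hash method, so calling it on the
# array itself cannot yield key/value pairs.
array_has_to_hash = dataflow.respond_to?(:to_hash)

# The first element is the Hash that holds the actual pairs.
pairs = dataflow[0].map { |k, v| [k, v] }

puts array_has_to_hash
puts pairs.inspect
```

Indexing with [0] reaches the Hash, and each on a Hash yields the key and value that event.set needs.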

Very good, thank you. It is working.

Final code:

if [dataflow] {
   ruby {
     code => '
         event.get("[dataflow][0]").each { |k,v|
           event.set("#{k}_name", k)
           event.set("#{k}_version", v)
         }
         event.remove("dataflow")
   '
   }
}
if [runtime] {
   ruby {
      code => '
          event.get("[runtime]").each { |k,v|
          event.set(k,v)
          }
          event.remove("runtime")
      '
    }
}

The output looks something like this:

   "gate_name" => "gate",
   "gate_version" => "v204.06",
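
To try the same transformation outside Logstash, here is a minimal plain-Ruby sketch. The helper name flatten_event is hypothetical and not a Logstash API. Unlike the final code above, it walks every entry of the dataflow array rather than only the first, and it prefixes the runtime keys with runtime_ as in the desired document from the first post; both are assumptions about what is wanted.

```ruby
# Hypothetical helper: flattens a parsed event Hash into top-level fields.
def flatten_event(doc)
  flat = {}
  doc.each do |key, value|
    case key
    when "dataflow"
      # Assumes dataflow is an Array of Hashes; visit every entry.
      value.each do |entry|
        entry.each do |name, version|
          flat["#{name}_name"] = name
          flat["#{name}_version"] = version
        end
      end
    when "runtime"
      # Assumes runtime is a Hash; prefix its keys with "runtime_".
      value.each { |k, v| flat["runtime_#{k}"] = v }
    else
      # Everything else is copied through unchanged.
      flat[key] = value
    end
  end
  flat
end

doc = {
  "jobid" => 348332180,
  "runtime" => { "cputime" => 24, "totalelapsed" => 105 },
  "dataflow" => [{ "gate" => "v204.06", "attrmath" => "v206.07", "keep" => "lib" }]
}

flat = flatten_event(doc)
puts flat.inspect
```

If two dataflow entries share a key, the later one overwrites the earlier one in this sketch.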
