Access nested fields in mutate filter

Hello,

I would like to ask a question regarding accessing nested fields in Logstash.

My JSON contains a "processes" object with a nested field "Name". I would like to extract the value of that nested field into a separate field.

This is working:
mutate{
add_field => { "Name" => "%{processes[Name]}" }
}

Now suppose that my JSON also has a "columns" array, which contains the names of the fields of the "processes" object:
"columns":["Offset(V)","Name","PID","PPID","Thds","Hnds","Sess","Wow64","Start","Exit"]

To access the "Name" field of "processes", I would like to take the name "Name" from "columns" instead of hardcoding it as in the working solution above. I tried something like this, but it does not work:
mutate{
add_field => { "Name" => "%{[processes][columns][1]}" }
}

If I use this, the result in the output is:
"Name":"%{[processes][columns][1]}"

If I try this, I get the correct name of the field:
mutate{
add_field => { "Name" => "%{columns[1]}" }
}

Is there a problem with the syntax? I've tried many combinations of syntax for accessing the field in my mutate filter.

Thanks.

add_field => { "Name" => "%{processes[Name]}" }

You're missing one set of square brackets:

add_field => { "Name" => "%{[processes][Name]}" }

If I try this, I get the correct name of the field:
mutate{
add_field => { "Name" => "%{columns[1]}" }
}

That seems a bit odd, but it's hard to tell when we don't know exactly what an event looks like. The result of a stdout { codec => rubydebug } output would be useful.

If I use this:
add_field => { "Name" => "%{processes[Name]}" }

then this is the output of file {codec => rubydebug}:

{
  "@version" => "1",
  "@timestamp" => "2016-08-16T20:33:54.994Z",
  "beat" => { "hostname" => "dc1", "name" => "dc1" },
  "count" => 1,
  "fields" => nil,
  "input_type" => "log",
  "offset" => 26224,
  "source" => "D:\\volatilityOutputs\\pslist.txt",
  "type" => "psList",
  "host" => "dc1",
  "tags" => [ [0] "beats_input_codec_plain_applied" ],
  "rows" => [
    [0] [ [0] 33333738026449876432, [1] "System", [2] 4, [3] 0, [4] 159, [5] 591, [6] -1, [7] 0, [8] "2016-07-14 01:25:11 UTC+0000", [9] "" ],
    [1] [ [0] 18446738026459611200, [1] "smss.exe", [2] 408, [3] 4, [4] 3, [5] 33, [6] -1, [7] 0, [8] "2016-07-14 01:25:11 UTC+0000", [9] "" ]
  ],
  "columns" => [ [0] "Offset(V)", [1] "Name", [2] "PID", [3] "PPID", [4] "Thds", [5] "Hnds", [6] "Sess", [7] "Wow64", [8] "Start", [9] "Exit" ],
  "processes" => { "Offset(V)" => 33333738026449876432, "Name" => "System", "PID" => 4, "PPID" => 0 },
  "Name" => "System"
}
{
  "@version" => "1",
  "@timestamp" => "2016-08-16T20:33:54.994Z",
  "beat" => { "hostname" => "dc1", "name" => "dc1" },
  "count" => 1,
  "fields" => nil,
  "input_type" => "log",
  "offset" => 26224,
  "source" => "D:\\volatilityOutputs\\pslist.txt",
  "type" => "psList",
  "host" => "dc1",
  "tags" => [ [0] "beats_input_codec_plain_applied" ],
  "rows" => [
    [0] [ [0] 33333738026449876432, [1] "System", [2] 4, [3] 0, [4] 159, [5] 591, [6] -1, [7] 0, [8] "2016-07-14 01:25:11 UTC+0000", [9] "" ],
    [1] [ [0] 18446738026459611200, [1] "smss.exe", [2] 408, [3] 4, [4] 3, [5] 33, [6] -1, [7] 0, [8] "2016-07-14 01:25:11 UTC+0000", [9] "" ]
  ],
  "columns" => [ [0] "Offset(V)", [1] "Name", [2] "PID", [3] "PPID", [4] "Thds", [5] "Hnds", [6] "Sess", [7] "Wow64", [8] "Start", [9] "Exit" ],
  "processes" => { "Offset(V)" => 18446738026459611200, "Name" => "smss.exe", "PID" => 408, "PPID" => 4 },
  "Name" => "smss.exe"
}

I'm trying to map the "rows" values to the "columns" values and then create a separate event for each row. Finally, I want to remove "rows", "columns" and "processes" from each event.

This is what I want to achieve for each row-column pair (see the last line of the results):
"Name" => "System"
or
"Name" => "smss.exe" .

The problem is that I don't want to achieve it like this:
add_field => { "Name" => "%{processes[Name]}" }

I would like to achieve it like this, but it is not working:
add_field => { "Name" => "%{processes[columns][1]}" }

Thank you very much.

The problem is that I don't want to achieve it like this:
add_field => { "Name" => "%{processes[Name]}" }

Again, you're missing a set of square brackets. The above might work, but if it does, that may be due to a bug. Use the supported syntax to avoid things breaking accidentally.

I would like to achieve it like this, but it is not working:
add_field => { "Name" => "%{processes[columns][1]}" }

You're missing square brackets here too, but the main problem is that there is no [processes][columns] field. The columns field you have is at the top level which is why [columns][1] works.
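
The lookup being asked for has two steps: read the key name from the top-level columns array, then use that name to index into processes. As far as I know, a %{...} reference cannot take part of its path from another field, so this has to be done in Ruby. A minimal sketch in plain Ruby, where an ordinary Hash stands in for the event and lookup_by_column is a hypothetical helper name, not part of Logstash:

```ruby
# Plain Hash standing in for a Logstash event (an assumption for illustration).
event = {
  'columns'   => ['Offset(V)', 'Name', 'PID', 'PPID'],
  'processes' => { 'Offset(V)' => 33333738026449876432, 'Name' => 'System', 'PID' => 4, 'PPID' => 0 }
}

# Hypothetical helper: take the key name from the top-level "columns" array,
# then use that name to read the value inside "processes".
def lookup_by_column(event, index)
  key = event['columns'][index]
  event['processes'][key]
end

event['Name'] = lookup_by_column(event, 1)
puts event['Name']
```

Inside a ruby filter the same two steps would operate on the real event object instead of a Hash.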

You need a ruby filter if you want to process all values in the rows field no matter how many there are. Something similar to this should work (but is untested):

ruby {
  code => "
    event['result'] = []
    event['rows'].each { |r|
      event['columns'].zip(r).each { |values|
        event['result'] << Hash[values]
      }
    }
  "
}

After this the result field should be a list of {"Offset(V)": ..., "Name": ..., ...} objects. Use the split filter to turn them into separate objects.

I've tried it, but the result of Hash[values] is empty. This is what I see in the output:

"result" => [ [ 0] {}, [ 1] {}, [ 2] {}, [ 3] {}, [ 4] {}, [ 5] {}, [ 6] {}, [ 7] {}, [ 8] {}, [ 9] {}, [10] {}, [11] {}, [12] {}, [13] {}, [14] {}, [15] {}, [16] {}, [17] {}, [18] {}, [19] {} ]

The values themselves seem to be OK. When I added event['result_without_hash'] << [values], this was the result:
"result_without_hash" => [ [ 0] [ [0] [ [0] "Offset(V)", [1] 33333738026449876432 ] ], [ 1] [ [0] [ [0] "Name", [1] "System" ] ], ...

Do you know why the result of Hash[values] is empty?
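
One likely explanation, offered as a sketch rather than a confirmed diagnosis: zip(r) yields one [column, value] pair at a time, so Hash[values] receives a single two-element array such as ["Name", "System"]. When Hash[] is given one array, it expects an array of [key, value] pairs. The two strings are not pairs, and older Ruby versions appear to skip such elements silently, which would produce {}. Passing the whole zipped row to Hash[] once per row avoids this. In the example below, rows_to_hashes is a hypothetical helper and plain arrays stand in for the event fields:

```ruby
# Sample data copied from the rubydebug output above, shortened to four columns.
columns = ['Offset(V)', 'Name', 'PID', 'PPID']
rows = [
  [33333738026449876432, 'System', 4, 0],
  [18446738026459611200, 'smss.exe', 408, 4]
]

# Hypothetical helper: build one hash per row.
# columns.zip(row) gives [[column, value], ...], which is the
# array-of-pairs form that Hash[] accepts.
def rows_to_hashes(columns, rows)
  rows.map { |row| Hash[columns.zip(row)] }
end

result = rows_to_hashes(columns, rows)
puts result.inspect
```

With this shape, result holds one hash per row instead of one empty hash per cell.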

I achieved something similar with this Ruby script, and it worked:
event['processes'] = event['rows'].collect.collect { |process| { event['columns'][0]=>process[0], event['columns'][1]=>process[1], event['columns'][2]=>process[2], event['columns'][3]=>process[3] } }

You can see the resulting "processes" object in my previous post. The problem is that I had to use the indexes 0, 1, 2, etc. to access the field elements. A better option would be to use a for loop with the loop variable as the index, like this, but it does not work:
event['processes'] = event['rows'].collect.collect { |process| { for i in 0..9 event['columns'][i]=>process[i] end } }
Is there any way to iterate through the arrays by index?
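
A for loop cannot appear inside a hash literal, because the literal expects key => value pairs rather than statements. One way to iterate by index is to start with an empty hash and fill it inside each_with_index. A minimal sketch in plain Ruby, assuming the columns and one row are available as arrays (row_to_hash is a hypothetical helper name):

```ruby
# Hypothetical helper: pair each column name with the value at the same index.
def row_to_hash(columns, row)
  process = {}
  columns.each_with_index do |column, i|
    process[column] = row[i]
  end
  process
end

columns = ['Offset(V)', 'Name', 'PID', 'PPID']
row = [18446738026459611200, 'smss.exe', 408, 4]

puts row_to_hash(columns, row).inspect
```

Because the loop runs over columns, it works for any number of columns without listing the indexes by hand.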