Logstash with nested JSON parser

I have Kafka input, something like this:

{"interactionId":"7891013","applicantInfoList":[{"cif":123456,"role":"Primary","applicantType":"PROSPECT"}],"correlationId":"d828bcf9-0fb7-467b-b2c5-a67c2305a9db","applicationId":"99999","productType":4600,"timestamp":1443713482677}

Logstash configuration:

input {
  file {
    path => "/Users/xxx/Desktop/ELK/ELK Infrastructure/kafka_xxx_sample_data.txt"
    start_position => "beginning"
  }
}

filter {
  json {
    source => "message"
  }
  if [applicantInfoList][cif] == 123456 {
    mutate {
      add_field => {
        "new_field" => "Application Accepted"
      }
    }
  }
  else if [applicantInfoList][cif] == 7891011 {
    mutate {
      add_field => {
        "new_field" => "Application Rejected"
      }
    }
  }
}

output {
  elasticsearch {
    action => "index"
    hosts => "127.0.0.1:9200"
    codec => "plain"
    flush_size => 5000
    idle_flush_time => 1
    index => "logstash-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }
}

Issue: it is not recognizing the nested field [applicantInfoList][cif] because of the [ ].

But if I remove the [] from the input, it is able to recognize the nested field cif.

How would I solve this without removing the []?

applicantInfoList contains an array of objects so [applicantInfoList][cif] won't work. You'd have to indicate which of the elements in the array you want to access. For example, the first element should be accessible via [applicantInfoList][0][cif].
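For instance, with the sample event above, checking only the first applicant would look something like this (only the field reference changes; the json filter and mutate are taken from your original config):

filter {
  json {
    source => "message"
  }
  if [applicantInfoList][0][cif] == 123456 {
    mutate {
      add_field => { "new_field" => "Application Accepted" }
    }
  }
}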

Thanks

Hi

Example:

{"interactionId":"7891013","applicantInfoList":[{"cif":123456,"role":"Primary","applicantType":"PROSPECT"},{"cif":123456,"role":"Primary","applicantType":"PROSPECT"},{"cif":123456,"role":"Primary","applicantType":"PROSPECT"}],"correlationId":"d828bcf9-0fb7-467b-b2c5-a67c2305a9db","applicationId":"99999","productType":4600,"timestamp":1443713482677}

{"interactionId":"7891013","applicantInfoList":[{"cif":123456,"role":"Primary","applicantType":"PROSPECT"},{"cif":123456,"role":"Primary","applicantType":"PROSPECT"}],"correlationId":"d828bcf9-0fb7-467b-b2c5-a67c2305a9db","applicationId":"99999","productType":4600,"timestamp":1443713482677}

It could be two, three, or an unknown number of nested objects.

If I have repetitive nested data, is there a way to do it dynamically instead of hardcoding [applicantInfoList][0][cif] or [applicantInfoList][1][cif] ...?

Yes, but you'll have to use a ruby filter. I'm afraid I don't have time to write up a full example, but a rough sketch is below.
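A minimal, untested sketch, assuming the event['field'] accessor used in the later posts in this thread (pre-5.x event API; on Logstash 5+ you would use event.get/event.set instead) and the sample cif value from the original post:

filter {
  ruby {
    # Walk every element of the array instead of hardcoding an index.
    code => "
      list = event['applicantInfoList']
      if list.is_a?(Array) && list.any? { |a| a['cif'] == 123456 }
        event['new_field'] = 'Application Accepted'
      end
    "
  }
}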

I'm not able to figure out deeply nested JSON.

Example:

{"interactionId":"7891013","applicantInfoList":[{"pri_cif":123456,"role":"Primary","applicantType":"PROSPECT"},{"sec_cif":123456,"role":"secondary","applicantType":"PROSPECT","homeAddress" => {
"addressType" => "RESIDENCE",
"addressLineOne" => "3715 W. Trevino",
"addressLineTwo" => "a",
"city" => "Hobbs",
"state" => "NM",
"zip" => "88240",
"country" => "US"
}}],"correlationId":"d828bcf9-0fb7-467b-b2c5-a67c2305a9db","applicationId":"99999","productType":4600,"timestamp":1443713482677}

This is not working:

filter {
  ruby {
    code => "event['applicantInfoList'][homeAddress].each { |x| x.delete('addressType') .each { |x| x.delete('addressLineOne')}"
  }
}

Any suggestions on how to delete deeply nested fields?

I used this and it worked fine:

filter {
  ruby {
    code => "event['applicantInfoList'].each { |x| if x['homeAddress']; x['homeAddress'].delete('addressType'); x['homeAddress'].delete('addressLineOne') end }"
  }
}

Thanks for replying with your example! :slight_smile: