Logstash Extract Elasticsearch Nested Fields

Hello Badger

Thanks for your response.

Let me describe it again properly.

My Logstash conf file is below.

My input is an Elasticsearch index, version 7.8.0:

input {
  elasticsearch {
    hosts => "10.10.10.10:9200"
    query => '{"_source" : ["userId", "timeStamp", "backupSettings.backgroundUploading", "backupSettings.connectionType", "backupSettings.contactBackup", "backupSettings.contactPermission", "backupSettings.photoBackup", "backupSettings.photoQuality", "backupSettings.storagePermission", "backupSettings.videoBackup", "appEvents.attribute_num.filesPendingForUpload","appEvents.attribute_num.filesUploadedSinceLastEvent"],"query" : { "match_all": {} }}'
    size => 10000
    scroll => "5m"
    index => "backupindex"
  }
}

I have added a filter to copy the multi-nested fields into new top-level fields.

filter {
    mutate {
        add_field => { "filesPendingForUpload" => "%{[[appEvents][attribute_num]][filesPendingForUpload]}" }
        add_field => { "filesUploadedSinceLastEvent" => "%{[[appEvents][attribute_num]][filesUploadedSinceLastEvent]}" }
    }
}

The output goes to a CSV file:

output {
  csv {
    fields => ["userId", "timeStamp", "[backupSettings][backgroundUploading]", "[backupSettings][connectionType]", "[backupSettings][contactBackup]", "[backupSettings][contactPermission]", "[backupSettings][photoBackup]", "[backupSettings][photoQuality]", "[backupSettings][storagePermission]", "[backupSettings][videoBackup]", "[filesPendingForUpload]", "[filesUploadedSinceLastEvent]"]
    path => "/tmp/exp.csv"
  }
}

Is the implementation still unclear from the explanation above?

Hi,

Could someone please help me?

If my description of the issue is still incomplete, please let me know and I will fill in the gaps.

The output suggests that these are not working. Why do you have a second set of square brackets around [appEvents][attribute_num]?

Hi Badger

I tried the options below:

add_field => { "filesPendingForUpload" => "%{[appEvents][attribute_num][filesPendingForUpload]}" }
add_field => { "filesUploadedSinceLastEvent" => "%{[appEvents][attribute_num][filesUploadedSinceLastEvent]}" }

add_field => { "filesPendingForUpload" => "%{[[appEvents][attribute_num]][filesPendingForUpload]}" }
add_field => { "filesUploadedSinceLastEvent" => "%{[[appEvents][attribute_num]][filesUploadedSinceLastEvent]}" }

add_field => { "filesPendingForUpload" => "%{[appEvents][[attribute_num][filesPendingForUpload]]}" }
add_field => { "filesUploadedSinceLastEvent" => "%{[appEvents][[attribute_num][filesUploadedSinceLastEvent]]}" }

I came across some forum posts saying to use double square brackets for multi-nested paths.

I tried all three options above, but none worked.

Could you please help me get the data for the fields below into the Logstash CSV output plugin? How should such fields be referenced?

appEvents.attribute_num.filesPendingForUpload 
appEvents.attribute_num.filesUploadedSinceLastEvent

I also tried the following:

  1. specifying "[appEvents][0][attribute_num][filesPendingForUpload]"
input {
  elasticsearch {
    hosts => "10.10.101.1:9200"
    size => 10000
    index => "backupindex"
  }
}



output {
  csv {
    fields => ["userId", "timeStamp", "[backupSettings][backgroundUploading]", "[backupSettings][connectionType]", "[backupSettings][contactBackup]", "[backupSettings][contactPermission]", "[backupSettings][photoBackup]", "[backupSettings][photoQuality]", "[backupSettings][storagePermission]", "[backupSettings][videoBackup]", "[appEvents][0][attribute_num][filesPendingForUpload]"]
    path => "/tmp/exp.csv"
  }
}
  2. specifying "[appEvents][attribute_num][filesPendingForUpload]"

Neither helped. In the CSV, the field was blank.

Can you replace your output with

output { stdout { codec => rubydebug } }

and show us what [appEvents] looks like?

Okay, sure.

{
         "timeStamp" => 1623203744873,
        "@timestamp" => 2021-06-15T04:55:11.613Z,
          "@version" => "1",
            "userId" => "c0135af5feb34a77baee92b7551c4d06",
         "appEvents" => [
        [0] {
            "attribute_num" => {
                "filesUploadedSinceLastEvent" => 0,
                      "filesPendingForUpload" => 2
            }
        }
    ],
    "backupSettings" => {
             "connectionType" => "WLC",
        "backgroundUploading" => "Y",
          "contactPermission" => "Y",
                "photoBackup" => "CAM",
               "photoQuality" => "Original",
                "videoBackup" => "OFF",
              "contactBackup" => "Y",
          "storagePermission" => "Y"
    }
}

[appEvents] is an array, so you should be using "[appEvents][0][attribute_num][filesPendingForUpload]"

Hello Badger

I have already tried using this.

Could you please have a look at this post?

I do not know what to say. Both the mutate+add_field and the csv output will call event.get on "[appEvents][0][attribute_num][filesPendingForUpload]". If that field exists, it should be expanded.

Are you getting this for every event?
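
One way to check this per event is to do the same lookup in a ruby filter and print the result. This is only a diagnostic sketch: the [@metadata][pendingFound] field name is made up for illustration, and it assumes the field path matches the rubydebug output shown earlier.

```
filter {
  ruby {
    # Look up the nested field the same way the other plugins do
    # and record whether it was found on this event.
    code => '
      v = event.get("[appEvents][0][attribute_num][filesPendingForUpload]")
      event.set("[@metadata][pendingFound]", !v.nil?)
    '
  }
}

output {
  # metadata => true also prints [@metadata], which is hidden by default.
  stdout { codec => rubydebug { metadata => true } }
}
```

If pendingFound is false for some events and true for others, the blank columns probably come from documents that simply lack the field.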

Hi Badger

I did not use mutate this time.

input {
  elasticsearch {
    hosts => "10.10.101.10:9200"
    size => 10000
    index => "backup"
  }
}



output {
  csv {
    fields => ["userId", "timeStamp", "[backupSettings][backgroundUploading]", "[backupSettings][connectionType]", "[backupSettings][contactBackup]", "[backupSettings][contactPermission]", "[backupSettings][photoBackup]", "[backupSettings][photoQuality]", "[backupSettings][storagePermission]", "[backupSettings][videoBackup]", "[appEvents][0][attribute_num][filesUploadedSinceLastEvent]", "[appEvents][0][attribute_num][filesPendingForUpload]"]
    path => "/tmp/exp.csv"
  }
}

Output:

6018a66855b6471ebf976b2d551c9731,1623241661046,Y,WLC,Y,Y,CAM,Original,Y,OFF,,

You can see that the last two fields are empty; they should contain data.

Could you please help me find what I am missing that keeps these fields out of the CSV file?

There is some hope now.

I used mutate to add the fields, and I got the record successfully.

The only problem now is that when an event does not contain the nested field, the literal sprintf reference is printed instead of an empty value.

Could you please suggest ways to avoid this?

input {
  elasticsearch {
    hosts => "10.10.100.10:9200"
    query => '{"_source" : ["userId", "timeStamp", "backupSettings.backgroundUploading", "backupSettings.connectionType", "backupSettings.contactBackup", "backupSettings.contactPermission", "backupSettings.photoBackup", "backupSettings.photoQuality", "backupSettings.storagePermission", "backupSettings.videoBackup", "appEvents.attribute_num.filesPendingForUpload","appEvents.attribute_num.filesUploadedSinceLastEvent"],"query" : { "match_all": {} }}'
    size => 10000
    scroll => "5m"
    index => "backup"
  }
}



filter {
    mutate {
        add_field => { "filesPendingForUpload" => "%{[appEvents][0][attribute_num][filesPendingForUpload]}" }
        add_field => { "filesUploadedSinceLastEvent" => "%{[appEvents][0][attribute_num][filesUploadedSinceLastEvent]}" }
    }
}

output {
  csv {
    fields => ["userId", "timeStamp", "[backupSettings][backgroundUploading]", "[backupSettings][connectionType]", "[backupSettings][contactBackup]", "[backupSettings][contactPermission]", "[backupSettings][photoBackup]", "[backupSettings][photoQuality]", "[backupSettings][storagePermission]", "[backupSettings][videoBackup]", "filesPendingForUpload", "filesUploadedSinceLastEvent"]
    path => "/tmp/exp.csv"
  }
}


6018a66855b6471ebf976b2d551c9731,1623241661046,Y,WLC,Y,Y,CAM,Original,Y,OFF,8,173
6018a66855b6471ebf976b2d551c9731,1623256154896,,,,,,,,,%{[appEvents][0][attribute_num][filesPendingForUpload]},%{[appEvents][0][attribute_num][filesUploadedSinceLastEvent]}
6018a66855b6471ebf976b2d551c9731,1623254680762,,,,,,,,,%{[appEvents][0][attribute_num][filesPendingForUpload]},%{[appEvents][0][attribute_num][filesUploadedSinceLastEvent]}

Possibly use a prune filter where the default value for blacklist_names will delete fields where a sprintf reference failed. I think the unresolved sprintf reference will be top-level so then you should get an empty field.
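
A rough sketch of how that could look if the unresolved reference ends up as the field's value rather than its name. It assumes prune's blacklist_values option, which maps a field name to a regular expression; the pattern "^%[{]" is my own guess at matching a value that still starts with "%{", so please test it before relying on it.

```
filter {
  mutate {
    add_field => {
      "filesPendingForUpload" => "%{[appEvents][0][attribute_num][filesPendingForUpload]}"
      "filesUploadedSinceLastEvent" => "%{[appEvents][0][attribute_num][filesUploadedSinceLastEvent]}"
    }
  }
  prune {
    # Remove each top-level field whose value still begins with "%{",
    # meaning the sprintf reference was left unexpanded.
    blacklist_values => {
      "filesPendingForUpload" => "^%[{]"
      "filesUploadedSinceLastEvent" => "^%[{]"
    }
  }
}
```

With the fields removed, the csv output should write an empty column for them, the same way it does for the missing backupSettings fields.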

Thank you for your response.

I will check and reply here.

Hello Badger

I tried the approaches below, but they did not help.
I am sure my prune filter is not written correctly, which is why it is not working as expected.

Could you please help me with the correct syntax?

filter {
    mutate {
        add_field => { "filesPendingForUpload" => "%{[appEvents][0][attribute_num][filesPendingForUpload]}" }
        add_field => { "filesUploadedSinceLastEvent" => "%{[appEvents][0][attribute_num][filesUploadedSinceLastEvent]}" }
    }
    prune {
        blacklist_names => [ "filesPendingForUpload", "filesUploadedSinceLastEvent" ]
    }
}

filter {
    mutate {
        add_field => { "filesPendingForUpload" => "%{[appEvents][0][attribute_num][filesPendingForUpload]}" }
        add_field => { "filesUploadedSinceLastEvent" => "%{[appEvents][0][attribute_num][filesUploadedSinceLastEvent]}" }
    }
    prune {
        blacklist_names => [ "filesPendingForUpload", "filesUploadedSinceLastEvent" ]
        blacklist_values => [ "" , "" ]
    }
}

I also tried the following:

filter {
    mutate {
        add_field => { "filesPendingForUpload" => "%{[appEvents][0][attribute_num][filesPendingForUpload]}" }
        add_field => { "filesUploadedSinceLastEvent" => "%{[appEvents][0][attribute_num][filesUploadedSinceLastEvent]}" }
    }
    prune {
        blacklist_values => [ "filesPendingForUpload" ,"" , "filesUploadedSinceLastEvent", "" ]
    }
}

I would like to highlight that these fields are missing in some documents.

For example, the document below does not have
[appEvents][0][attribute_num][filesUploadedSinceLastEvent] and [appEvents][0][attribute_num][filesPendingForUpload]:

{
        "userId" => "6018a66855b6471ebf976b2d551c9731",
    "@timestamp" => 2021-06-16T13:40:32.584Z,
      "@version" => "1",
     "timeStamp" => 1623256154896
}

In such cases, the output in the CSV file looks like this:

6018a66855b6471ebf976b2d551c9731,1623256154896,,,,,,,,,%{[appEvents][0][attribute_num][filesPendingForUpload]},%{[appEvents][0][attribute_num][filesUploadedSinceLastEvent]}
6018a66855b6471ebf976b2d551c9731,1623254680762,,,,,,,,,%{[appEvents][0][attribute_num][filesPendingForUpload]},%{[appEvents][0][attribute_num][filesUploadedSinceLastEvent]}
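
One possible way around this, sketched here rather than tested against this index, is to add the fields only when the nested object is present, so the literal "%{...}" text is never created. It assumes both counters exist whenever [appEvents][0][attribute_num] does; if only one of them can be missing, each would need its own check.

```
filter {
  # Skip the mutate entirely for documents without appEvents data.
  if [appEvents][0][attribute_num] {
    mutate {
      add_field => {
        "filesPendingForUpload" => "%{[appEvents][0][attribute_num][filesPendingForUpload]}"
        "filesUploadedSinceLastEvent" => "%{[appEvents][0][attribute_num][filesUploadedSinceLastEvent]}"
      }
    }
  }
}
```

For events that skip the mutate, the two fields do not exist, so the csv output should leave those columns empty. The mutate filter's copy option may be another candidate, since as far as I know it leaves the event untouched when the source field is absent, but that is worth confirming on your version.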