Logstash Extract Elasticsearch Nested Fields

Hello Team

I am using Elasticsearch version 7.8.0 and Logstash version 7.8.0.

I have an ES index with the following mappings:

    "mappings" : {
      "properties" : {
        "appEvents" : {
          "properties" : {
            "attribute" : {
              "properties" : {
                "filesPendingForUpload" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "filesUploadedSinceLastEvent" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "launch" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },

Now I want to extract the data from the ES index to a CSV file using Logstash.

In the Logstash conf file I have specified:

output {
  csv {
    fields => ["[appEvents][attribute][filesPendingForUpload]"]
    path => "/tmp/exp.csv"
  }
}

But I am not getting the respective field data in the CSV.

As per my understanding, I am not writing the fields in the proper format...

Could you please help me get the data into the CSV?

Thank You.

I meant to ask: how do I reference deeply nested fields in Logstash?

Hello Team

I tried using the mutate filter to copy the nested fields into new top-level fields:

filter {
    mutate {
        add_field => { "filesPendingForUpload" => "%{[appEvents][attribute_num][filesPendingForUpload]}" }
        add_field => { "filesUploadedSinceLastEvent" => "%{[appEvents][attribute_num][filesUploadedSinceLastEvent]}" }
  }
}

But my output contains:

6018a66855b6471ebf976b2d551c9731,1623241661046,Y,WLC,Y,Y,CAM,Original,,OFF,%{[appEvents][attribute_num][filesPendingForUpload]},%{[appEvents][attribute_num][filesUploadedSinceLastEvent]}

Below is the data I want to extract into the CSV file using Logstash...

I tried multiple ways, but was not able to achieve it...

Could someone please help me ...

"appEvents":[{"attribute_num":{"filesPendingForUpload":8,"filesUploadedSinceLastEvent":173}}]```

Can someone please help me with how to reference deeply nested fields in Logstash, specifically in the fields option of the csv output plugin?

Can someone please help me ...

Not easily. You have shown that you have an elasticsearch mapping. That suggests that you have an index somewhere. It gives no indication of what kind of events the logstash input is producing.

The output plugin configuration you show would produce a single column CSV, but you say your output contains multiple columns.

sprintf references like %{[appEvents][attribute_num][filesUploadedSinceLastEvent]} showing up in the output would be consistent with that field not existing on the event, and with you referencing [filesUploadedSinceLastEvent] in the csv output after the event went through that mutate filter.
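For illustration, here is a sketch of that behavior (the field names here are made up, not from your config): when the field inside a sprintf reference does not exist on the event, add_field stores the literal reference string instead of a value.

filter {
  mutate {
    # If [foo][bar] does not exist on the event, then [copy] is set
    # to the literal string "%{[foo][bar]}" rather than to a value.
    add_field => { "copy" => "%{[foo][bar]}" }
  }
}

So seeing the raw %{...} text in your CSV usually means the referenced path does not match the shape of the event.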

Basically it is impossible for what you describe to be happening, and we cannot guess which parts of your description are wrong.

We are happy to help, but you have to help us to help you.

Hello Badger

Thanks for your response.

Let me put it again properly ...

My Logstash conf file is below.

My input is an Elasticsearch index (version 7.8.0):

input {
  elasticsearch {
    hosts => "10.10.10.10:9200"
    query => '{"_source" : ["userId", "timeStamp", "backupSettings.backgroundUploading", "backupSettings.connectionType", "backupSettings.contactBackup", "backupSettings.contactPermission", "backupSettings.photoBackup", "backupSettings.photoQuality", "backupSettings.storagePermission", "backupSettings.videoBackup", "appEvents.attribute_num.filesPendingForUpload","appEvents.attribute_num.filesUploadedSinceLastEvent"],"query" : { "match_all": {} }}'
    size => 10000
    scroll => "5m"
    index => "backupindex"
  }
}

Now I have a filter added to copy the multi-nested fields into new top-level fields:

filter {
    mutate {
        add_field => { "filesPendingForUpload" => "%{[[appEvents][attribute_num]][filesPendingForUpload]}" }
        add_field => { "filesUploadedSinceLastEvent" => "%{[[appEvents][attribute_num]][filesUploadedSinceLastEvent]}" }
    }
}

Now the output goes to a CSV file:

output {
  csv {
    fields => ["userId", "timeStamp", "[backupSettings][backgroundUploading]", "[backupSettings][connectionType]", "[backupSettings][contactBackup]", "[backupSettings][contactPermission]", "[backupSettings][photoBackup]", "[backupSettings][photoQuality]", "[backupSettings][storagePermission]", "[backupSettings][videoBackup]", "[filesPendingForUpload]", "[filesUploadedSinceLastEvent]"]
    path => "/tmp/exp.csv"
  }
}

Is my explanation above still unclear?

Hi

Could someone please help me?

If my description of the issue is still incomplete, please do let me know and I will fill in the gaps...

The output suggests that these are not working. Why do you have a second set of square brackets around [appEvents][attribute_num]?

Hi Badger

I tried the below options:

add_field => { "filesPendingForUpload" => "%{[appEvents][attribute_num][filesPendingForUpload]}" }
add_field => { "filesUploadedSinceLastEvent" => "%{[appEvents][attribute_num][filesUploadedSinceLastEvent]}" }

add_field => { "filesPendingForUpload" => "%{[[appEvents][attribute_num]][filesPendingForUpload]}" }
add_field => { "filesUploadedSinceLastEvent" => "%{[[appEvents][attribute_num]][filesUploadedSinceLastEvent]}" }

add_field => { "filesPendingForUpload" => "%{[appEvents][[attribute_num][filesPendingForUpload]]}" }
add_field => { "filesUploadedSinceLastEvent" => "%{[appEvents][[attribute_num][filesUploadedSinceLastEvent]]}" }

I came across some forums saying to use double square brackets for multi-nested paths...

I tried all 3 options above, but none worked...

Could you please help me extract the data of the fields below in the Logstash csv output plugin? How do I reference such fields?

appEvents.attribute_num.filesPendingForUpload 
appEvents.attribute_num.filesUploadedSinceLastEvent

I also tried the below:

  1. specifying "[appEvents][0][attribute_num][filesPendingForUpload]"
input {
  elasticsearch {
    hosts => "10.10.101.1:9200"
    size => 10000
    index => "backupindex"
  }
}



output {
  csv {
    fields => ["userId", "timeStamp", "[backupSettings][backgroundUploading]", "[backupSettings][connectionType]", "[backupSettings][contactBackup]", "[backupSettings][contactPermission]", "[backupSettings][photoBackup]", "[backupSettings][photoQuality]", "[backupSettings][storagePermission]", "[backupSettings][videoBackup]", "[appEvents][0][attribute_num][filesPendingForUpload]"]
    path => "/tmp/exp.csv"
  }
}
  2. specifying "[appEvents][attribute_num][filesPendingForUpload]"

Neither helped me... in the CSV, the field was blank.

Can you replace your output with

output { stdout { codec => rubydebug } }

and show us what [appEvents] looks like?

Okay Sure ..

{
         "timeStamp" => 1623203744873,
        "@timestamp" => 2021-06-15T04:55:11.613Z,
          "@version" => "1",
            "userId" => "c0135af5feb34a77baee92b7551c4d06",
         "appEvents" => [
        [0] {
            "attribute_num" => {
                "filesUploadedSinceLastEvent" => 0,
                      "filesPendingForUpload" => 2
            }
        }
    ],
    "backupSettings" => {
             "connectionType" => "WLC",
        "backgroundUploading" => "Y",
          "contactPermission" => "Y",
                "photoBackup" => "CAM",
               "photoQuality" => "Original",
                "videoBackup" => "OFF",
              "contactBackup" => "Y",
          "storagePermission" => "Y"
    }
}

[appEvents] is an array, so you should be using "[appEvents][0][attribute_num][filesPendingForUpload]"
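Applied to the event you pasted, a mutate along these lines (a sketch, using the field names from your rubydebug output) should populate top-level fields you can then list in the csv output:

filter {
  mutate {
    add_field => {
      # [0] selects the first entry of the [appEvents] array
      "filesPendingForUpload" => "%{[appEvents][0][attribute_num][filesPendingForUpload]}"
      "filesUploadedSinceLastEvent" => "%{[appEvents][0][attribute_num][filesUploadedSinceLastEvent]}"
    }
  }
}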

Hello Badger

I have already tried using this ...

Could you please have a look at this post..

I do not know what to say. Both the mutate+add_field and the csv output will call event.get("[appEvents][0][attribute_num][filesPendingForUpload]"). If that field exists it should be expanded.

Are you getting this for every event?

Hi Badger

I did not use mutate this time...

input {
  elasticsearch {
    hosts => "10.10.101.10:9200"
    size => 10000
    index => "backup"
  }
}



output {
  csv {
    fields => ["userId", "timeStamp", "[backupSettings][backgroundUploading]", "[backupSettings][connectionType]", "[backupSettings][contactBackup]", "[backupSettings][contactPermission]", "[backupSettings][photoBackup]", "[backupSettings][photoQuality]", "[backupSettings][storagePermission]", "[backupSettings][videoBackup]", "[appEvents][0][attribute_num][filesUploadedSinceLastEvent]", "[appEvents][0][attribute_num][filesPendingForUpload]"]
    path => "/tmp/exp.csv"
  }
}

Output:

6018a66855b6471ebf976b2d551c9731,1623241661046,Y,WLC,Y,Y,CAM,Original,Y,OFF,,

You can see the last two fields are empty...

They should contain data.
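As a further check, I could try copying the values to top-level fields with a ruby filter before the csv output, so the csv output only has to reference simple field names (just a sketch, not tested yet):

filter {
  ruby {
    code => '
      # event.get on an object field returns a Hash with string keys
      attrs = event.get("[appEvents][0][attribute_num]")
      if attrs.is_a?(Hash)
        event.set("filesPendingForUpload", attrs["filesPendingForUpload"])
        event.set("filesUploadedSinceLastEvent", attrs["filesUploadedSinceLastEvent"])
      end
    '
  }
}

Then the csv fields list could simply use "filesPendingForUpload" and "filesUploadedSinceLastEvent".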