Transform each array item (nested object) into a new document - Splitting a field before removing it

Hi all,

I have an index with a nested field like this:

{
     "id_single_profile": "14e98c40-d5ed-11e8-aa2a-d991a7f7e009",
     "id_audience": "3c9b46e4-1b10-4ad4-9a68-ae371034adfe",
     "activities": [
                {
                  "id_activity": "activity-1",
                  "nm_product": "xx",              
                  "nm_product_category": "WEB",
                  "nm_establishment_category": "Softwares / APP"
                },
                {
                  "id_activity": "activity-2",
                  "nm_product": "yy",              
                  "nm_product_category": "WEB",
                  "nm_establishment_category": "Softwares / APP"              
                }
                ]
}

I'm trying to turn each nested object into a new index document. Something like this:

[
{
        "_index": "doc1",
        "_type": "activity",
        "_id": "activity-1",
        "_source": {        
            "nm_product": "xx",              
            "nm_product_category": "WEB",
            "nm_establishment_category": "Softwares / APP"                  
            }
},
{
        "_index": "doc1",
        "_type": "activity",
        "_id": "activity-2",
        "_source": {        
            "nm_product": "yy",              
            "nm_product_category": "WEB",
            "nm_establishment_category": "Softwares / APP"                  
            }
}            
]

--

I've been using the filter below in Logstash. I'm splitting on the "activities" nested field, and that works fine. The problem is removing the nested field from the document after the split. When I use "remove_field" to remove the "activities" field from the output, the split just doesn't work.

filter {
    split { field => "activities" }

    mutate {
        add_field => {
            "nm_product" => "%{[activities][nm_product]}"
            "nm_product_category" => "%{[activities][nm_product_category]}"
            "nm_establishment_category" => "%{[activities][nm_establishment_category]}"
        }
    }

    mutate { remove_field => ["@version","@timestamp"] }
}

Is there something I'm missing?

I would appreciate any comments. Thanks!

Provided you do the remove_field after the split it works just fine for me.

That didn't work for me.
When I add "activities" to remove_field, the split doesn't work and I get just one document.

filter {
    split { field => "activities" }

    mutate { remove_field => ["@version","@timestamp","activities"] }

    mutate {
        add_field => {
            "nm_product_category" => "%{[activities][nm_product_category]}"
            "nm_establishment_category" => "%{[activities][nm_establishment_category]}"
            "vl_total" => "%{[activities][vl_total]}"
        }
    }
}

Do the remove of activities after you have copied the fields out of it using mutate+add_field.

Yes, I have tried putting the remove both before and after copying the fields. Neither worked; the results were the same.

It seems the order of the mutate filters doesn't make any difference.

Order most certainly matters.
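
To illustrate why order matters, here is a minimal sketch, assuming "activities" has already been parsed into an array of objects. Filters run top to bottom, and a %{...} reference to a field that no longer exists is left in the new field as literal text. This shows only the ordering effect for one field; it does not by itself explain the "just one document" symptom reported above.

```
filter {
    split { field => "activities" }

    # Wrong order (commented out): "activities" is removed before add_field
    # runs, so the reference cannot be resolved and "nm_product" ends up
    # holding the literal text "%{[activities][nm_product]}".
    # mutate { remove_field => ["activities"] }
    # mutate { add_field => { "nm_product" => "%{[activities][nm_product]}" } }

    # Working order: copy the value out first, then remove the source field.
    mutate { add_field => { "nm_product" => "%{[activities][nm_product]}" } }
    mutate { remove_field => ["activities"] }
}
```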

I've tried using metadata fields to copy the values, but I got the same results.

I can split the nested object and add new fields based on the split fields. The problem comes when I try to remove the field that I'm splitting on.

It seems I cannot split a field and then remove it in the same pipeline.

I cannot imagine why that would be. When I run

input { generator { count => 1 message => '{
 "id_single_profile": "14e98c40-d5ed-11e8-aa2a-d991a7f7e009",
 "id_audience": "3c9b46e4-1b10-4ad4-9a68-ae371034adfe",
 "activities": [
            {
              "id_activity": "activity-1",
              "nm_product": "xx",
              "nm_product_category": "WEB",
              "nm_establishment_category": "Softwares / APP"
            },
            {
              "id_activity": "activity-2",
              "nm_product": "yy",
              "nm_product_category": "WEB",
              "nm_establishment_category": "Softwares / APP"
            }
            ]
}' } }

filter {
    json { source => "message" }
    split { field => "activities" }

    mutate {
        add_field => {
            "nm_product" => "%{[activities][nm_product]}"
            "nm_product_category" => "%{[activities][nm_product_category]}"
            "nm_establishment_category" => "%{[activities][nm_establishment_category]}"
        }
    }

    mutate { remove_field => [ "@version", "@timestamp", "activities", "message" ] }
}

output { stdout { codec => rubydebug { metadata => false } } }

I get two events

{
              "id_audience" => "3c9b46e4-1b10-4ad4-9a68-ae371034adfe",
               "nm_product" => "xx",
"nm_establishment_category" => "Softwares / APP",
        "id_single_profile" => "14e98c40-d5ed-11e8-aa2a-d991a7f7e009",
      "nm_product_category" => "WEB",
 [...]
}
{
              "id_audience" => "3c9b46e4-1b10-4ad4-9a68-ae371034adfe",
               "nm_product" => "yy",
"nm_establishment_category" => "Softwares / APP",
        "id_single_profile" => "14e98c40-d5ed-11e8-aa2a-d991a7f7e009",
      "nm_product_category" => "WEB",
 [...]
}
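
The original question also wanted each activity's id_activity to become the document _id. The config above drops that value along with "activities". One possible way to keep it is sketched below, under these assumptions: the value is copied into a [@metadata] field (metadata fields are not sent to outputs), the elasticsearch output is used, and the hosts value is a placeholder that is not from this thread. The index name "doc1" is taken from the desired output in the question.

```
filter {
    json { source => "message" }
    split { field => "activities" }

    mutate {
        add_field => {
            # Kept only for the output stage; not stored in the document.
            "[@metadata][id_activity]" => "%{[activities][id_activity]}"
            "nm_product" => "%{[activities][nm_product]}"
            "nm_product_category" => "%{[activities][nm_product_category]}"
            "nm_establishment_category" => "%{[activities][nm_establishment_category]}"
        }
    }

    mutate { remove_field => [ "@version", "@timestamp", "activities", "message" ] }
}

output {
    elasticsearch {
        hosts => ["http://localhost:9200"]   # assumption: adjust to your cluster
        index => "doc1"
        document_id => "%{[@metadata][id_activity]}"
    }
}
```

If the parent fields should not appear in the new documents either, id_single_profile and id_audience can be added to the same remove_field list.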
