Flattening a nested array in JSON

I have JSON like the sample below and I am trying to flatten it before storing it in Elasticsearch, so that all nested elements end up on level 1. I have read about the ruby filter script, but I am new to ELK and Ruby. Can someone help me out?

{
  "school":"Someschool",
  "user":[
    {
      "name":"user1",
      "subject":[{
        "score":7,
        "topic":"science",
        "date":{"day":2,"month":4,"year":2020}
      },
      {
        "score":6,
        "topic":"Maths",
        "date":{"day":2,"month":4,"year":2020}
      }]
    },
    {
      "name":"user2",
      "subject":[{
        "score":9,
        "topic":"science",
        "date":{"day":2,"month":4,"year":2020}
      },
      {
        "score":4,
        "topic":"Maths",
        "date":{"day":2,"month":4,"year":2020}
      }]
    }
  ],

  "address":[
    {"type":"present","area":"somearea","zip":12323},
    {"type":"temp","area":"somearea","zip":342}
  ],
  "reg":{"year":1990,"day":12,"month":1}
}

What do you mean by that? If nested elements like name and subject are moved to the top level there can only be one of them, so you will lose data.

I would like to append an index to the key, something like this:

{
  "school":"Someschool",
  "user.name":"",
  "user.subject.score-0":"7",

  -----
  "user.subject.score-1":"6"
}

You would need to use a ruby filter. This might help you get started. You will need to add array handling.
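
As a starting point for the array handling mentioned above, here is a minimal sketch in plain Ruby, outside Logstash. The `flatten_keys` helper is a hypothetical name, and putting the array index into the key path (`user.0.subject.1.score`) is an assumption: a trailing `-0` suffix alone would not distinguish the first subject of user1 from the first subject of user2. The sample document is a trimmed version of the one in the question.

```ruby
# Hypothetical helper: returns a new single-level hash.
# Hash keys and array indexes are joined with '.' to form the flat key.
# Empty hashes and empty arrays contribute no keys.
def flatten_keys(value, prefix = nil, out = {})
  case value
  when Hash
    value.each { |k, v| flatten_keys(v, prefix ? "#{prefix}.#{k}" : k.to_s, out) }
  when Array
    value.each_with_index { |v, i| flatten_keys(v, prefix ? "#{prefix}.#{i}" : i.to_s, out) }
  else
    out[prefix] = value # leaf: string, number, boolean or nil
  end
  out
end

doc = {
  "school" => "Someschool",
  "user" => [
    { "name" => "user1", "subject" => [{ "score" => 7, "topic" => "science" },
                                       { "score" => 6, "topic" => "Maths" }] },
    { "name" => "user2", "subject" => [{ "score" => 9, "topic" => "science" }] }
  ],
  "reg" => { "year" => 1990 }
}

flat = flatten_keys(doc)
flat.each { |k, v| puts "#{k} = #{v}" }
```

Because every index is part of the key, no two leaves collide, so nothing is lost. The cost is that the number of fields grows with the length of the arrays.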

@Badger Thanks, I am already using this for JSON objects, but before that I need to flatten the arrays. With this link my JSON objects get flattened, but it does not work on arrays.
Also, with the above link we need to pass each field: for JSON with ten inner JSON objects we would need to call it ten times.
I just need help iterating over the arrays dynamically and getting the keys.
This is my plan:

  1. Get all arrays dynamically.
  2. Iterate over them and use the above link to flatten the JSON.

Repeat this for all of them.
I am new to this and only started exploring it in the last few days.
Thanks

No, it recursively processes the contents of hashes.


For iterating over the array dynamically, you can use the split filter on the field.

Split filter - https://www.elastic.co/guide/en/logstash/current/plugins-filters-split.html#plugins-filters-split-field

First run the split over user and then another split over subject. This will give you one event per user per subject and then use @Badger's code to recursively process the hashes.

filter {
  json{
    source => "message"
  }
  split{
    field => "user"
  }
  split{
    field => "[user][subject]"
  }
  # code to process the hashes recursively here
}
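
To make it clearer what the two split filters do to a single input, here is a rough simulation in plain Ruby. It is not the plugin's code: `split_events` is a hypothetical helper that copies the event once per array element and replaces the array with that element, which is the behaviour the split filter documentation describes. As a simplification, an event whose field is not an array is passed through unchanged here.

```ruby
# Hypothetical stand-in for the split filter: for every element of the array
# found at `path`, emit a deep copy of the event with the array replaced by
# that single element.
def split_events(events, path)
  events.flat_map do |event|
    array = event.dig(*path)
    next [event] unless array.is_a?(Array) # nothing to split, pass through
    array.map do |element|
      copy = Marshal.load(Marshal.dump(event)) # deep copy of the event
      parent = path.size > 1 ? copy.dig(*path[0..-2]) : copy
      parent[path.last] = element
      copy
    end
  end
end

event = {
  "school" => "Someschool",
  "user" => [
    { "name" => "user1", "subject" => [{ "topic" => "science", "score" => 7 },
                                       { "topic" => "Maths",   "score" => 6 }] },
    { "name" => "user2", "subject" => [{ "topic" => "science", "score" => 9 },
                                       { "topic" => "Maths",   "score" => 4 }] }
  ]
}

per_user    = split_events([event], ["user"])
per_subject = split_events(per_user, ["user", "subject"])

per_subject.each do |e|
  subject = e["user"]["subject"]
  puts "#{e['school']} #{e['user']['name']} #{subject['topic']} #{subject['score']}"
end
```

One input with two users and two subjects each becomes four events, and each event is indexed as its own document. The fields that were not split, such as school, are repeated in every one of them.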

What is your end goal? If you flatten this data you will lose the object relations as @Badger mentioned. Also, can you please share the mapping if you are using one?


Ullas,

See my topic below, where Badger helped me.

https://discuss.elastic.co/t/split-a-json-array-with-same-fields-names/233008/12

I think it can be used in your case.


@Rahul_Kumar4 Thanks.
This looks like a simpler and easier approach. Unfortunately, there are a few challenges:

  1. Splitting on multiple fields at the same level is not working (user and address are both arrays at the same level). I need to figure this out.

When I tried field => [user][address] I got the error: Only String and Array types are splittable. field:[user][address] is of type = NilClass

  2. When I split the array I get {"k":"v"}, but @Badger's function needs the field name to be passed, as in "fn":{"k":"v"}, to flatten it.
    I am still figuring out the right approach.
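
The NilClass error in the first point is consistent with how Logstash field references work: `[user][address]` means the key `address` inside `user`, not two separate fields. The sketch below mimics that lookup in plain Ruby. `lookup` is a hypothetical helper, not part of Logstash, and the sample event is trimmed down from the one in the question.

```ruby
# Hypothetical helper mimicking a Logstash field reference lookup:
# "[a][b]" is read as the path a -> b, a bare name as a single top-level key.
def lookup(event, reference)
  path = reference.start_with?("[") ? reference.scan(/\[([^\]]+)\]/).flatten : [reference]
  path.reduce(event) { |node, key| node.is_a?(Hash) ? node[key] : nil }
end

event = {
  "user"    => [{ "name" => "user1" }, { "name" => "user2" }],
  "address" => [{ "type" => "present", "zip" => 12323 }, { "type" => "temp", "zip" => 342 }]
}

p lookup(event, "[user][address]")   # nil: there is no address inside user
p lookup(event, "address").class     # Array: address is a sibling of user
```

Since the split filter's field option takes a single field, splitting both arrays would presumably need two separate split blocks, one with field => "user" and one with field => "address". Keep in mind that this multiplies the number of events.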

I just want to bring all nested keys to the first level. Technically just one { and one }.
I am trying to see whether this is possible. With two arrays it gets complicated.

@all, when we use a split filter, does it create two outputs from one JSON input? In my case it inserts two records into Elasticsearch with the same data except for the split keys.

@Badger Yes, it processes nested JSON recursively. I meant parallel JSON objects. Let's say:

{
   "A": {"A1":"VA1","A2":"VA2"},
   "B": {"B1":"VB1","B2":"VB2"}
}

In this case I need to call it twice, once with A and once with B. I made a small change to the ruby script (from your link) and am now able to process several fields by passing comma-separated values. Sharing it for future readers:

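# Script for the Logstash ruby filter.
# 'field' is a comma-separated list of top-level field names to flatten.
def register(params)
    @field = params['field']
end

# Recursively walk a hash and set every leaf on the event under a dotted key.
# Non-empty arrays are stored as-is, they are not expanded.
def flatten(object, name, event)
    return unless object
    if object.kind_of?(Hash) || object == []
        # Empty hashes and empty arrays produce no fields.
        object.each { |k, v| flatten(v, "#{name}.#{k}", event) }
    else
        event.set(name, object)
    end
end

def filter(event)
    @field.split(',').each do |input|
        input = input.strip
        o = event.get(input)
        # Only hashes are expanded. Scalars and missing fields are left alone,
        # otherwise a scalar would be re-set and then removed again below.
        next unless o.kind_of?(Hash)
        flatten(o, input, event)
        event.remove(input)
    end
    [event]
end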

Thank you for a wonderful and quick response.

