A JSON array of objects - how do I deal with it?

Hi !

I apologize in advance for a possibly stupid, "it-has-been-asked-many-times" kind of question. Nevertheless I have to ask for help because I am completely stuck.

So I have a JSON array consisting of objects that looks like this:

"data": [
 { "iso.org.dod.internet.experimental.94.1.8.1.7": 8, "index": "32.128.0.192.255.39.112.187.0.0.0.0.0.0.0.0.1", "iso.org.dod.internet.experimental.94.1.8.1.4": 3, "iso.org.dod.internet.experimental.94.1.8.1.6": "On-Board Temperature 1-Ctlr A: 22 C 71.60F", "iso.org.dod.internet.experimental.94.1.8.1.3": "On-Board Temperature 1-Ctlr A" }, 
{ "iso.org.dod.internet.experimental.94.1.8.1.7": 8, "index": "32.128.0.192.255.39.112.187.0.0.0.0.0.0.0.0.2", "iso.org.dod.internet.experimental.94.1.8.1.4": 3, "iso.org.dod.internet.experimental.94.1.8.1.6": "On-Board Temperature 1-Ctlr B: 23 C 73.40F", "iso.org.dod.internet.experimental.94.1.8.1.3": "On-Board Temperature 1-Ctlr B" }, 
{ "iso.org.dod.internet.experimental.94.1.8.1.7": 8, "index": "32.128.0.192.255.39.112.187.0.0.0.0.0.0.0.0.3", "iso.org.dod.internet.experimental.94.1.8.1.4": 3, "iso.org.dod.internet.experimental.94.1.8.1.6": "On-Board Temperature 2-Ctlr A: 25 C 77.00F", "iso.org.dod.internet.experimental.94.1.8.1.3": "On-Board Temperature 2-Ctlr A" } ]

According to my needs, I have to perform the following set of operations on every member of the array:

  • enrich each one with some additional fields
  • remove the "index" field
  • set human-readable names for the fields with MIB-like names
  • extract a performance value from the status string (e.g. Temp=22 for the first member)
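For the last step, here is a minimal Ruby sketch of how such a value could be pulled out of the status string. The helper name and regex are my own assumptions, not from any answer below; inside Logstash the same idea would usually be expressed with a grok or dissect filter instead:

```ruby
# Hypothetical helper: extract the Celsius reading from a status string
# like "On-Board Temperature 1-Ctlr A: 22 C 71.60F".
def extract_temp(status)
  m = status.match(/:\s*(-?\d+)\s*C\b/)  # capture the integer before "C"
  m ? m[1].to_i : nil
end

puts extract_temp("On-Board Temperature 1-Ctlr A: 22 C 71.60F")  # prints 22
```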

I would be able to do all of this if I split the array first:

    filter { split { field => "[data]" } }

Yes... but there is one little thing: the [data] array holds almost 200 members, so I obtain almost 200 documents ready to convert. Really ready - here is one of them as a result:

    {
        "instance": "128_128",
        "s_string": "On-Board Temperature 1-Ctlr A: 22 C 71.60F",
        "serialno": "5R6693C077",
        "@timestamp": "2021-02-16T11:06:58.229Z",
        "object": "MSA_2050",
        "name": "On-Board Temperature 1-Ctlr A",
        "s_status": "3",
        "Temp": 22,
        "location": "ArcDC",
        "parentname": "board"
    }

Actually, getting almost 200 documents instead of one every time logstash runs a query looks weird to me 8( personally 8).
I strongly suspect that if I tried to put data into my ES index that way, I would run into trouble sooner or later.

So let me ask you:
Is there a way to do this without splitting? How can I do all the conversions inside the array itself? I know about ruby code, but I have literally "no wall to lean on" (I am a complete beginner in Ruby). Would you be so kind as to show me a point to start from?

Any help would be appreciated.
Thanks a lot in advance.

Firstly, I do not understand your concern with indexing 200 separate documents.

That said, there are a couple of approaches you could take. One would be to iterate over the array in a ruby filter, like this

    ruby {
        code => '
            oldData = event.get("data")
            newData = []
            oldData.each { |x|
                if x.include? "iso.org.dod.internet.experimental.94.1.8.1.3"
                    x["name"] = x["iso.org.dod.internet.experimental.94.1.8.1.3"]
                    x.delete "iso.org.dod.internet.experimental.94.1.8.1.3"
                end
                if x.include? "iso.org.dod.internet.experimental.94.1.8.1.4"
                    x["s_status"] = x["iso.org.dod.internet.experimental.94.1.8.1.4"]
                    x.delete "iso.org.dod.internet.experimental.94.1.8.1.4"
                end
                if x.include? "iso.org.dod.internet.experimental.94.1.8.1.6"
                    x["s_string"] = x["iso.org.dod.internet.experimental.94.1.8.1.6"]
                    x.delete "iso.org.dod.internet.experimental.94.1.8.1.6"
                end
                if x.include? "index"
                    x.delete "index"
                end
                newData << x
            }
            event.set("data", newData)
        '
    }

Depending on the type of enrichment you are doing this may involve writing a lot more ruby code.
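To keep that from growing, the repeated if-blocks in the filter above can be driven by a lookup table. This is a sketch in plain Ruby (runnable outside Logstash); the OID keys are the ones from the question, and the `transform` helper name is my own:

```ruby
# Map MIB-style OID keys to human-readable names; any key not listed
# is kept unchanged, and "index" is dropped entirely.
RENAMES = {
  "iso.org.dod.internet.experimental.94.1.8.1.3" => "name",
  "iso.org.dod.internet.experimental.94.1.8.1.4" => "s_status",
  "iso.org.dod.internet.experimental.94.1.8.1.6" => "s_string",
}.freeze

def transform(obj)
  out = {}
  obj.each do |k, v|
    next if k == "index"          # remove the "index" field
    out[RENAMES.fetch(k, k)] = v  # rename known keys, keep the rest
  end
  out
end
```

Inside the ruby filter the loop body would then shrink to `newData << transform(x)`, and adding another rename is a one-line change to the table.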

Alternatively, since you already have a set of filters that make the changes you want, you could split the array and then aggregate it again.

    ruby {
        init => '@index = 1'
        code => '
            event.set("[@metadata][index]", @index)
            @index += 1
        '
    }
    split { field => "data" }
    # Insert filters here

    aggregate {
        task_id => "%{[@metadata][index]}"
        code => '
            map["@timestamp"] ||= event.get("@timestamp")
            map["data"] ||= []
            map["data"] << event.get("data")
            event.cancel
        '
        push_map_as_event_on_timeout => true
        timeout => 6
    }

The usual caveats about aggregate apply: you must set pipeline.workers to 1 for this to work, and make sure pipeline.ordered has the value you want (true; auto works in 7.x but not 8.x).

When using push_map_as_event_on_timeout the resulting event will only have the fields you add to the map, so if there are other fields you want to preserve, add lines similar to the one for map["@timestamp"].
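For example, assuming the events also carry the serialno and object fields shown in the sample document earlier in the thread, the aggregate code could preserve them like this (a sketch, not tested):

    aggregate {
        task_id => "%{[@metadata][index]}"
        code => '
            map["@timestamp"] ||= event.get("@timestamp")
            map["serialno"] ||= event.get("serialno")
            map["object"] ||= event.get("object")
            map["data"] ||= []
            map["data"] << event.get("data")
            event.cancel
        '
        push_map_as_event_on_timeout => true
        timeout => 6
    }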


Thanks a lot!
You have given me a really excellent way to do things the way I wanted! Thank you so much!

So let me explain my wish not to increase the document count. As far as I know, it is better to keep data in one big document than to spread it among hundreds of small ones. It really shortens search time and keeps the index from overgrowing. That is my opinion, so it could be wrong 8) That's why I am asking you.

Again, my best wishes to you! I will dig further (based on your answer) to make it work my way. 8)

Sincerely yours

Hi again !

Badger, may I ask you for a piece of advice about Ruby programming under ELK?

Are there resources where I can learn how to do it? Sure, I know I won't pick it up quickly, but a short course or a book you could point me to would be very useful.

Thanks again.

I learned Ruby by googling questions about it and reading lots of results from StackOverflow and various Ruby related sites. Also by reading the logstash filter source code on github to answer questions about how particular options on filters work.

The nice thing about using ruby in logstash is that you are often doing very simple tasks. You do not need to write a complete program that has complex functionality, so you can learn a little bit here and a little bit there and very gradually build up skills.


Fair enough !
That's my way too 8)

Thanks a lot!