A JSON array of objects - how do I deal with it?

Hi !

I apologize in advance for a possibly stupid, "it-has-been-asked-many-times" kind of question. Nevertheless I have to ask for help because I am completely stuck.

So I have a JSON array consisting of objects that looks like this:

"data": [
 { "iso.org.dod.internet.experimental.94.1.8.1.7": 8, "index": "32.128.0.192.255.39.112.187.0.0.0.0.0.0.0.0.1", "iso.org.dod.internet.experimental.94.1.8.1.4": 3, "iso.org.dod.internet.experimental.94.1.8.1.6": "On-Board Temperature 1-Ctlr A: 22 C 71.60F", "iso.org.dod.internet.experimental.94.1.8.1.3": "On-Board Temperature 1-Ctlr A" }, 
{ "iso.org.dod.internet.experimental.94.1.8.1.7": 8, "index": "32.128.0.192.255.39.112.187.0.0.0.0.0.0.0.0.2", "iso.org.dod.internet.experimental.94.1.8.1.4": 3, "iso.org.dod.internet.experimental.94.1.8.1.6": "On-Board Temperature 1-Ctlr B: 23 C 73.40F", "iso.org.dod.internet.experimental.94.1.8.1.3": "On-Board Temperature 1-Ctlr B" }, 
{ "iso.org.dod.internet.experimental.94.1.8.1.7": 8, "index": "32.128.0.192.255.39.112.187.0.0.0.0.0.0.0.0.3", "iso.org.dod.internet.experimental.94.1.8.1.4": 3, "iso.org.dod.internet.experimental.94.1.8.1.6": "On-Board Temperature 2-Ctlr A: 25 C 77.00F", "iso.org.dod.internet.experimental.94.1.8.1.3": "On-Board Temperature 2-Ctlr A" } ]

According to my needs, I have to perform the following set of operations on every member of the array:

  • enrich each one with some additional fields
  • remove the "index" field
  • set human-readable names for the fields with MIB-like names
  • extract a performance value from the status string (e.g. Temp=22 for the first member)
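For the last step, here is a minimal Ruby sketch of how such a value could be pulled out of the status string. The helper name and regex are my own assumptions, not from any answer below; inside Logstash the same idea would usually be expressed with a grok or dissect filter instead:

```ruby
# Hypothetical helper: extract the Celsius reading from a status string
# like "On-Board Temperature 1-Ctlr A: 22 C 71.60F".
def extract_temp(status)
  m = status.match(/:\s*(-?\d+)\s*C\b/)  # capture the integer before "C"
  m ? m[1].to_i : nil
end

puts extract_temp("On-Board Temperature 1-Ctlr A: 22 C 71.60F")  # prints 22
```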

I would be able to do all of this if I split the array first:

    filter { split { field => "[data]" } }

Yes... but there is one little thing: the [data] array holds almost 200 members, so I obtain almost 200 documents ready to convert. Really ready - here is one of them as a result:

    {
        "instance": "128_128",
        "s_string": "On-Board Temperature 1-Ctlr A: 22 C 71.60F",
        "serialno": "5R6693C077",
        "@timestamp": "2021-02-16T11:06:58.229Z",
        "object": "MSA_2050",
        "name": "On-Board Temperature 1-Ctlr A",
        "s_status": "3",
        "Temp": 22,
        "location": "ArcDC",
        "parentname": "board"
    }

Actually, getting almost 200 documents instead of one every time logstash runs a query looks weird to me 8( personally 8).
I strongly suspect that if I tried to put data into my ES index that way, I would run into trouble sooner or later.

So let me ask you:
Is there a way to do this without splitting? How can I do all the conversions inside the array itself? I know about ruby code, but I have literally "no wall to lean on" (I am a complete beginner in Ruby). Would you be so kind as to show me a point to start from?

Any help would be appreciated.
Thanks a lot in advance.

Firstly, I do not understand your concern with indexing 200 separate documents.

That said, there are a couple of approaches you could take. One would be to iterate over the array in a ruby filter, like this

    ruby {
        code => '
            oldData = event.get("data")
            newData = []
            oldData.each { |x|
                if x.include? "iso.org.dod.internet.experimental.94.1.8.1.3"
                    x["name"] = x["iso.org.dod.internet.experimental.94.1.8.1.3"]
                    x.delete "iso.org.dod.internet.experimental.94.1.8.1.3"
                end
                if x.include? "iso.org.dod.internet.experimental.94.1.8.1.4"
                    x["s_status"] = x["iso.org.dod.internet.experimental.94.1.8.1.4"]
                    x.delete "iso.org.dod.internet.experimental.94.1.8.1.4"
                end
                if x.include? "iso.org.dod.internet.experimental.94.1.8.1.6"
                    x["s_string"] = x["iso.org.dod.internet.experimental.94.1.8.1.6"]
                    x.delete "iso.org.dod.internet.experimental.94.1.8.1.6"
                end
                if x.include? "index"
                    x.delete "index"
                end
                newData << x
            }
            event.set("data", newData)
        '
    }

Depending on the type of enrichment you are doing this may involve writing a lot more ruby code.
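To keep that from growing, the repeated if-blocks in the filter above can be driven by a lookup table. This is a sketch in plain Ruby (runnable outside Logstash); the OID keys are the ones from the question, and the `transform` helper name is my own:

```ruby
# Map MIB-style OID keys to human-readable names; any key not listed
# is kept unchanged, and "index" is dropped entirely.
RENAMES = {
  "iso.org.dod.internet.experimental.94.1.8.1.3" => "name",
  "iso.org.dod.internet.experimental.94.1.8.1.4" => "s_status",
  "iso.org.dod.internet.experimental.94.1.8.1.6" => "s_string",
}.freeze

def transform(obj)
  out = {}
  obj.each do |k, v|
    next if k == "index"          # remove the "index" field
    out[RENAMES.fetch(k, k)] = v  # rename known keys, keep the rest
  end
  out
end
```

Inside the ruby filter the loop body would then shrink to `newData << transform(x)`, and adding another rename is a one-line change to the table.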

Alternatively, since you already have a set of filters that make the changes you want, you could split the array and then aggregate it again.

    ruby {
        init => '@index = 1'
        code => '
            event.set("[@metadata][index]", @index)
            @index += 1
        '
    }
    split { field => "data" }
    # Insert filters here

    aggregate {
        task_id => "%{[@metadata][index]}"
        code => '
            map["@timestamp"] ||= event.get("@timestamp")
            map["data"] ||= []
            map["data"] << event.get("data")
            event.cancel
        '
        push_map_as_event_on_timeout => true
        timeout => 6
    }

The usual caveats about aggregate apply: you must set pipeline.workers to 1 for this to work, and make sure pipeline.ordered has the value you want (true; auto works in 7.x but not 8.x).

When using push_map_as_event_on_timeout the resulting event will only have the fields you add to the map, so if there are other fields you want to preserve, add lines similar to the one for map["@timestamp"].
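For example, assuming the events also carry the serialno and object fields shown in the sample document earlier in the thread, the aggregate code could preserve them like this (a sketch, not tested):

    aggregate {
        task_id => "%{[@metadata][index]}"
        code => '
            map["@timestamp"] ||= event.get("@timestamp")
            map["serialno"] ||= event.get("serialno")
            map["object"] ||= event.get("object")
            map["data"] ||= []
            map["data"] << event.get("data")
            event.cancel
        '
        push_map_as_event_on_timeout => true
        timeout => 6
    }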


Thanks a lot!
You have given me a really excellent way to do things the way I wanted! Thank you so much!

So let me explain my wish not to increase the document count. As far as I know, it is better to keep data in one big document than to spread it among hundreds of small ones. It really shortens search time and keeps the index from overgrowing. That is my opinion, so it could be wrong 8) That's why I am asking you.

Again, my best wishes to you! I will dig further (based on your answer) to make it work my way. 8)

Sincerely yours

Hi again !

Badger, may I ask you for a piece of advice about Ruby programming under ELK?

Are there resources where I can learn how to do it? Sure, I know I won't pick it up quickly, but a short course or a book you could point me to would be very useful.

Thanks again.

I learned Ruby by googling questions about it and reading lots of results from StackOverflow and various Ruby related sites. Also by reading the logstash filter source code on github to answer questions about how particular options on filters work.

The nice thing about using ruby in logstash is that you are often doing very simple tasks. You do not need to write a complete program that has complex functionality, so you can learn a little bit here and a little bit there and very gradually build up skills.


Fair enough !
That's my way too 8)

Thanks a lot!