A JSON array consists of objects - how do I deal with it?

Firstly, I do not understand your concern with indexing 200 separate documents.

That said, there are a couple of approaches you could take. One would be to iterate over the array in a ruby filter, like this:

    ruby {
        code => '
            # Walk the [data] array, renaming the OID keys on each object
            oldData = event.get("data")
            newData = []
            oldData.each { |x|
                if x.include? "iso.org.dod.internet.experimental.94.1.8.1.3"
                    x["name"] = x["iso.org.dod.internet.experimental.94.1.8.1.3"]
                    x.delete "iso.org.dod.internet.experimental.94.1.8.1.3"
                end
                if x.include? "iso.org.dod.internet.experimental.94.1.8.1.4"
                    x["s_status"] = x["iso.org.dod.internet.experimental.94.1.8.1.4"]
                    x.delete "iso.org.dod.internet.experimental.94.1.8.1.4"
                end
                if x.include? "iso.org.dod.internet.experimental.94.1.8.1.6"
                    x["s_string"] = x["iso.org.dod.internet.experimental.94.1.8.1.6"]
                    x.delete "iso.org.dod.internet.experimental.94.1.8.1.6"
                end
                # The [index] entry is not needed, so drop it
                if x.include? "index"
                    x.delete "index"
                end
                newData << x
            }
            event.set("data", newData)
        '
    }

Depending on the type of enrichment you are doing, this may involve writing a lot more Ruby code.
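
If the enrichment is mostly renaming keys like this, the same loop can be driven by a lookup table instead of repeated if blocks. This is just a sketch of that idea, reusing the OID-to-name mapping from the example above; it would need to match your actual data.

    ruby {
        code => '
            # OID keys and the friendly names they should be renamed to
            renames = {
                "iso.org.dod.internet.experimental.94.1.8.1.3" => "name",
                "iso.org.dod.internet.experimental.94.1.8.1.4" => "s_status",
                "iso.org.dod.internet.experimental.94.1.8.1.6" => "s_string"
            }
            newData = (event.get("data") || []).map { |x|
                renames.each { |oid, friendly|
                    x[friendly] = x.delete(oid) if x.include? oid
                }
                x.delete "index"    # drop the per-object index, as before
                x
            }
            event.set("data", newData)
        '
    }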

Alternatively, since you already have a set of filters that make the changes you want, you could split the array, run those filters against each element, and then aggregate everything back together.

    # Give each incoming event a unique id so its pieces can be reassembled later
    ruby {
        init => '@index = 1'
        code => '
            event.set("[@metadata][index]", @index)
            @index += 1
        '
    }
    # Turn each element of [data] into a separate event
    split { field => "data" }
    # Insert filters here

    # Collect the modified events back into a single array, keyed by the id added above
    aggregate {
        task_id => "%{[@metadata][index]}"
        code => '
            map["@timestamp"] ||= event.get("@timestamp")
            map["data"] ||= []
            map["data"] << event.get("data")
            event.cancel
        '
        push_map_as_event_on_timeout => true
        timeout => 6
    }

The usual caveats about aggregate apply: you must set pipeline.workers to 1 for this to work, and make sure pipeline.ordered has the value you want, which is true (or auto, which works in 7.x but not in 8.x).
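
For reference, those settings live in logstash.yml (or in the pipeline's entry in pipelines.yml); a minimal sketch of what they might look like:

    pipeline.workers: 1
    pipeline.ordered: true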

When using push_map_as_event_on_timeout the resulting event will only have the fields you add to the map, so if there are other fields you want to preserve, add lines similar to the one for map["@timestamp"].
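
For example, if your events also had a hypothetical [host] field you wanted to keep, you would add another line of the same shape inside the aggregate code block:

            map["host"] ||= event.get("host")    # hypothetical field, preserved the same way as @timestamp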
