Filtering on nested JSON fields of varying length

Hello everyone,
First of all, I am new to Logstash, so I would not be surprised if what I am asking is outside the scope of what Logstash can do.
I am working on a project where I need to do multiple GeoIP lookups on a nested JSON string. Here is what the input string would look like, pretty-printed:

{
  "testid": "abdc",
  "ipversion": "v4",
  "flow": [
    {
      "destination_port": 33434,
      "dub": [
        {
          "index": 1,
          "ip": "192.168.1.254"
        },
        {
          "index": 2,
          "ip": "82.255.159.254"
        },
        {
          "index": 3,
          "ip": "82.255.210.12"
        }
      ]
    },
    {
      "destination_port": 33438,
      "dub": [
        {
          "index": 1,
          "ip": "172.2.12.12"
        },
        {
          "index": 2,
          "ip": "84.8.15.123"
        }
      ]
    }
  ]
}

What I want is to add a GeoIP lookup result field next to each "ip" field, inside the same array element. I tried to brute-force it by applying a filter to every possible position in the arrays (even non-existent ones), which worked but was very ugly. Is there a cleaner way to do it, using the Split filter or some other trick I am not yet aware of?

Many thanks in advance,

You could add an id field to the event, split each of the arrays, and then aggregate based on the id field, but I am not sure that counts as "cleaner".
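To make the split-then-aggregate idea concrete, here is a rough sketch of the data flow in plain Ruby. It is not Logstash filter code: split_hops and aggregate_hops are hypothetical helper names, "testid" is assumed to serve as the id field, and only the fields shown are carried through (a real pipeline would also have to preserve things like "ipversion").

```ruby
# Plain-Ruby illustration of split-then-aggregate; not actual Logstash filters.
# split_hops and aggregate_hops are hypothetical names used only for this sketch.

# Split: emit one flat record per hop, tagged with the parent id and flow position.
def split_hops(event)
  event['flow'].each_with_index.flat_map do |flow, flow_idx|
    flow['dub'].map do |hop|
      { 'id' => event['testid'],
        'flow_idx' => flow_idx,
        'destination_port' => flow['destination_port'],
        'hop' => hop.dup }
    end
  end
end

# Aggregate: group the flat records by id and rebuild the nested layout.
def aggregate_hops(records)
  records.group_by { |r| r['id'] }.map do |id, recs|
    flows = recs.group_by { |r| r['flow_idx'] }
                .sort_by { |flow_idx, _| flow_idx }
                .map do |_, hops|
      { 'destination_port' => hops.first['destination_port'],
        'dub' => hops.map { |r| r['hop'] } }
    end
    { 'testid' => id, 'flow' => flows }
  end
end
```

Between the two steps, each flat record has a single "ip" at a fixed position, so one ordinary lookup per record is enough. The cost is the extra bookkeeping needed to put the event back together.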

Thank you for the suggestion. I have been tinkering with a Ruby filter in the meantime, and I will come back to your solution if mine fails. Thank you for your time anyway!

Never mind, I managed to solve my problem with a Ruby filter. I'm leaving my solution here in case it helps someone:

input {
  file { 
    path => "/path/to/file/file.json"
    codec => "json"
    start_position => "beginning"
  }
}
filter {
   ruby {
      init => "
         require 'logstash/filters/geoip'
      "
      code => "
         # Iterate over the arrays themselves, so the event needs no separate count fields
         flows = event.get('flow') || []
         flows.each_with_index do |flow, idflow|
             (flow['dub'] || []).each_index do |idhop|
                # The geoip filter takes its source and target as settings, so build one per hop
                geoip = LogStash::Filters::GeoIP.new({'source' => %([flow][#{idflow}][dub][#{idhop}][ip]),
                                       'target' => %([flow][#{idflow}][dub][#{idhop}][geoip])
                                       })
                geoip.register
                geoip.filter(event)
             end
         end
      "
   }
}

Not the cleanest, but the code is short and completes in a reasonable time.
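For anyone who wants to check the traversal outside Logstash, here is a minimal sketch in plain Ruby. It walks the arrays directly, so it needs no count fields. annotate_hops is a hypothetical helper, and lookup is a hypothetical callable standing in for the GeoIP database; it is assumed to return nil when an address has no entry (as you would expect for private addresses).

```ruby
# Plain-Ruby sketch of the nested walk; not Logstash filter code.
# annotate_hops and lookup are hypothetical names used only for illustration.
def annotate_hops(event, lookup)
  (event['flow'] || []).each do |flow|
    (flow['dub'] || []).each do |hop|
      result = lookup.call(hop['ip'])
      # Only add the field when the lookup produced something
      hop['geoip'] = result unless result.nil?
    end
  end
  event
end
```

You would call it as annotate_hops(parsed_json, lookup), where parsed_json is the hash obtained from the input line. Flows and hops of any length are handled, and a missing "flow" or "dub" key is simply skipped.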

I'll mark this post as a solution so the topic can be closed.
