Logstash: iterating through an array field with a specific action for every value

Hi, Logstash wizards! I suspect Ruby is needed for this, but I have tried many approaches without success :( I parse DNS responses in Logstash, and as a result I have the field below, which holds an array of values:

    {"dns": {
          "answers": {
            "data": [
              "apple.com.akadns.net",
              "17.57.146.53",
              "17.57.146.52",
              "17.57.146.68",
              "17.57.146.69"
            ]
          }
    }}

This field is, of course, of data type text. I need to copy only the IP addresses from this array (e.g. using the regex \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) and save them to a new array field of data type IP. The result should look like:

    {"dns": {
          "answers": {
            "data_ip": [
              "17.57.146.53",
              "17.57.146.52",
              "17.57.146.68",
              "17.57.146.69"
            ]
          }
    }}

Then I need to use the geoip plugin to look up the ASN, company name, etc. for every IP address in the field [dns][answers][data_ip] and store the results in new array fields. The result should look like:

    {"dns": {
          "answers": {
            "data_geo_asn": [
              "714",
              "714",
              "714",
              "714"
            ]
          }
    }}
    {"dns": {
          "answers": {
            "data_geo_company": [
              "Apple",
              "Apple",
              "Apple",
              "Apple"
            ]
          }
    }}

The geoip filter I use should look something like this:

    geoip {
      database => "/etc/logstash/conf.d/GeoIP2-ISP.mmdb"
      source => "[dns][answers][data_ip]"
      target => "[dns][answers][data_geo]"
      tag_on_failure => ["_geoip_asn_lookup_failure"]
    }

Really appreciate any help! Jan

The first part (extracting the IP addresses) is fairly simple. Something like this (which I have not tested):

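    ruby {
        code => '
            data = event.get("[dns][answers][data]")
            if data.is_a?(Array)
                # Keep only entries that look like dotted-quad IPv4 addresses.
                # select returns a new array instead of modifying data in place.
                ips = data.select { |x| x.to_s.match(/\A\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\z/) }
                event.set("[dns][answers][data_ip]", ips)
            end
        '
    }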
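The digit-based regex also accepts strings such as "999.1.1.1". If stricter checking is wanted, one option is to let Ruby's standard `IPAddr` class do the validation. The following is a minimal sketch in plain Ruby, outside Logstash, which I have not run inside a pipeline; `ipv4_only` is a hypothetical helper name, not something Logstash provides.

```ruby
require "ipaddr"

# Hypothetical helper: returns only the entries that parse as IPv4 addresses.
def ipv4_only(values)
  Array(values).select do |value|
    begin
      value.is_a?(String) && IPAddr.new(value).ipv4?
    rescue IPAddr::Error
      # Host names and malformed addresses end up here and are dropped.
      false
    end
  end
end

answers = [
  "apple.com.akadns.net",
  "17.57.146.53",
  "17.57.146.52",
  "999.1.1.1"
]
ips = ipv4_only(answers)
```

Inside the ruby filter, the same `select` block could replace the regex match, provided `require "ipaddr"` is added (for example in the filter's `init` option).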

However, when passed an array, a geoip filter only parses the first entry. One possibility would be to split on the data_ip array so that you have multiple events, call geoip on each of them, then aggregate them again. Something similar to this.
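To make the data flow of that split / lookup / aggregate idea concrete, here is a rough sketch in plain Ruby, outside Logstash. It only illustrates the shape of the transformation, not the actual filter configuration. `ISP_TABLE` is a hypothetical stand-in for the GeoIP2-ISP database, filled with the values from the question, and the function names are made up for this example.

```ruby
# Hypothetical stand-in for a GeoIP2-ISP lookup (values from the question).
ISP_TABLE = {
  "17.57.146.53" => { "asn" => "714", "company" => "Apple" },
  "17.57.146.52" => { "asn" => "714", "company" => "Apple" }
}.freeze

# Step 1 (split): one small event per address.
def split_events(ips)
  ips.map { |ip| { "data_ip" => ip } }
end

# Step 2 (lookup): enrich a single event; unknown addresses get nil values.
def lookup(event, table)
  event.merge(table.fetch(event["data_ip"], { "asn" => nil, "company" => nil }))
end

# Step 3 (aggregate): fold the per-address events back into parallel arrays,
# keeping the same order as the input.
def aggregate(events)
  {
    "data_ip" => events.map { |e| e["data_ip"] },
    "data_geo_asn" => events.map { |e| e["asn"] },
    "data_geo_company" => events.map { |e| e["company"] }
  }
end

ips = ["17.57.146.53", "17.57.146.52"]
result = aggregate(split_events(ips).map { |e| lookup(e, ISP_TABLE) })
```

Keeping a nil entry for addresses that cannot be resolved means the three arrays stay aligned by index, which is probably what you want if they are read side by side later.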
