Iterating over fields with multiple filter processes

philhagen · January 14, 2025, 2:49pm

I run the same series of filters over a number of fields that may or may not exist. For consistency and to simplify maintenance, I'd like to do this with one code block instead of multiple repetitive ones. As a specific example, I have a bunch of different IP address fields, and for each one I want to perform City and ASN GeoIP lookups, tag the record based on the class of IP, then add the address to the related.ip field. Example for one of these blocks is here.

I suspect this will need a ruby script that would be called for each source field (or maybe with an array of the field names that would then be iterated over within the ruby script). However, I'm having trouble finding any applicable examples of how to call a filter from within that script. I've cobbled together the following "first step" approach based on several references found here and elsewhere, but it doesn't seem to do the trick.

Any pointers greatly appreciated.

require 'logstash/filters/geoip'

def register(params)
    @source_field = params["source_field"]
end

def filter(event)

    source = event.get(@source_field)

    @asn_geo = LogStash::Filters::GeoIP.new({'database' => '/usr/local/share/GeoIP/GeoLite2-ASN.mmdb',
                                             'default_database_type' => 'ASN',
                                             'source' => source
                                            })

    @asn_geo.filter(event)

    return [event]
end

Edited to add: This is called from the configuration with the following:

ruby {
  path => "/path/to/above/ip_enrich.rb"
  script_params => {
    "source_field" => "[source][ip]"
  }
}

Badger · January 14, 2025, 4:30pm

The following works, in that it instantiates the filter and calls it.

    mutate { add_field => { "[foo][ip]" => "140.222.225.135" } }

    ruby {
        init => '
            require "logstash/filters/geoip"

            @geoip_filter = LogStash::Filters::GeoIP.new( {
                                         "database" => "/tmp/geoip_database_management/1736869833/GeoLite2-City.mmdb",
                                         "default_database_type" => "City",
                                         "source" => "[foo][ip]"
                                        } )
        '
        code => '
            @geoip_filter.filter(event)
        '
    }

However, it refuses to do the lookup and always tags the event with "_geoip_expired_database". This is true even if I add a geoip filter to the configuration and strace shows them both calling openat on the same .mmdb file.

I also cannot get the instantiated filter to log anything.

It's unclear why this would be useful even if it worked. The "source" option is stored when the filter class is registered. It is not a variable. You cannot have a ruby script iterate over a list of fields and call the same filter for each one; it would have to create one filter per field. In which case it is going to be easier to do that in the logstash configuration, leaving ruby out of it.
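
If the per-field blocks end up living in the logstash configuration, one way to keep them from drifting apart might be to generate them from a single template rather than writing each by hand. The following is only a rough sketch of that idea, run as a standalone Ruby script outside of logstash. The helper names (field_ref, enrichment_block, render_filters), the City database path, and the [geo] and [as] target fields are all assumptions for illustration, not something taken from the configuration discussed above.

```ruby
# Hypothetical generator: emits one pair of geoip filter blocks per IP field.
# Database paths and target field names are assumptions; adjust to your setup.
CITY_DB = "/usr/local/share/GeoIP/GeoLite2-City.mmdb"
ASN_DB  = "/usr/local/share/GeoIP/GeoLite2-ASN.mmdb"

# Convert a dotted parent name such as "dns.answers" into a logstash
# field reference prefix such as "[dns][answers]".
def field_ref(dotted)
  dotted.split(".").map { |part| "[#{part}]" }.join
end

# Build the City and ASN lookups for the [ip] subfield of one parent field,
# wrapped in a conditional so that missing fields are skipped.
def enrichment_block(parent)
  ref = field_ref(parent)
  <<~CONF
    if #{ref}[ip] {
      geoip {
        source => "#{ref}[ip]"
        target => "#{ref}[geo]"
        database => "#{CITY_DB}"
        default_database_type => "City"
      }
      geoip {
        source => "#{ref}[ip]"
        target => "#{ref}[as]"
        database => "#{ASN_DB}"
        default_database_type => "ASN"
      }
    }
  CONF
end

# Wrap the blocks for every parent field in a single filter section,
# indenting each non-empty line by two spaces.
def render_filters(parents)
  body = parents.map { |p| enrichment_block(p).gsub(/^(?=.)/, "  ") }.join
  "filter {\n" + body + "}\n"
end

if __FILE__ == $PROGRAM_NAME
  puts render_filters(%w[source destination dns.answers])
end
```

The output could be redirected into a file in the pipeline's configuration directory. Any other per-field steps (tagging, adding to related.ip) would go into the same template, so a change made once applies to every field the next time the file is regenerated.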

philhagen · January 14, 2025, 6:32pm

Hm. Ok, well maybe it's reassuring that this isn't just me?! haha

The reason for doing this is to put all the IP enrichment code into one place rather than repeating it for the IP subfield under all of: source, destination, dns.answers, client, remote, backend, xff, zeek.ftp.data_channel.originating, zeek.ftp.data_channel.response, original_source, and original_destination (so far). Just hoping to take advantage of repeatability and minimize the opportunity to get all the same IP address post-processing out of sync across fields.

At a bare minimum, I was hoping to call a single enrichment script for each field. That would at least centralize all the equivalent processing sequences into one place. Best (imagined) case would be the list approach.

Still crossing fingers that there is a way, but if it's not obvious for someone like you who's far more versed in these functions, I might be inclined to accept the less elegant but functional approach. Thank you!
