I run a series of consistent filters over a series of fields that may or may not exist. For consistency and to simplify maintenance, I'd like to do this with one code block instead of multiple repetitive ones. For a specific example, I have a bunch of different IP address fields, but want to perform City and ASN GeoIP lookups, tag the record based on the class of IP, then add it to the related.ip field. Example for one of these blocks is here.
I suspect this will need a ruby script that would be called for each source field (or maybe with an array of the field names that would then be iterated over within the ruby script). However, I'm having trouble finding any applicable examples of how to call a filter from within that script. I've cobbled together the following "first step" approach based on several references found here and elsewhere, but it doesn't seem to do the trick.
However, it refuses to do the lookup and always tags the event with "_geoip_expired_database". This is true even if I add a geoip filter to the configuration and strace shows them both calling openat on the same .mmdb file.
I also cannot get the instantiated filter to log anything.
It's unclear why this would be useful even if it worked. The "source" option is stored when the filter class is registered; it is not a variable. You cannot have a ruby script iterate over a list of fields and call the filter for each one; it would have to create one filter per field. In that case it is going to be easier to do that in the logstash configuration, leaving ruby out of it.
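If the worry is the per-field blocks drifting apart, one possible way to keep them in the logstash configuration and still maintain them in one place is to generate them from a template. The following is only a rough sketch of that idea, not something from this thread: the helper names (`field_ref`, `geoip_block`, `render_pipeline`) and the constant `IP_PARENTS` are made up for illustration, the list is a subset of the parent fields mentioned here, and `target` is deliberately left as a comment because the right value depends on the `ecs_compatibility` mode in use.

```ruby
# Hypothetical generator: emits one pair of geoip filters (City + ASN) per
# parent field, so every field gets an identical block from one template.

# Subset of the parent fields from this thread; extend as needed.
IP_PARENTS = %w[source destination client zeek.ftp.data_channel.originating]

# Turn a dotted name such as "zeek.ftp" into a field reference "[zeek][ftp]".
def field_ref(dotted_name)
  dotted_name.split(".").map { |part| "[#{part}]" }.join
end

# Render the conditional plus two geoip filters for one parent field.
def geoip_block(parent)
  source = "#{field_ref(parent)}[ip]"
  <<~CONF
    if #{source} {
      # target: set this to suit your ecs_compatibility mode
      geoip {
        source => "#{source}"
        default_database_type => "City"
      }
      geoip {
        source => "#{source}"
        default_database_type => "ASN"
      }
    }
  CONF
end

# Join the blocks for every parent into one config fragment.
def render_pipeline(parents)
  parents.map { |parent| geoip_block(parent) }.join("\n")
end
```

Running `puts render_pipeline(IP_PARENTS)` and pasting (or writing) the output into the filter section would give one filter per field, as described above, while the template stays the single place to edit.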
Hm. Ok, well maybe it's reassuring that this isn't just me?! haha
The reason for doing this is to put all the IP enrichment code into one place rather than repeating it for the IP subfield under all of: source, destination, dns.answers, client, remote, backend, xff, zeek.ftp.data_channel.originating, zeek.ftp.data_channel.response, original_source, and original_destination (so far). Just hoping to take advantage of repeatability and minimize the opportunity to get all the same IP address post-processing out of sync across fields.
At a bare minimum, I was hoping to call a single enrichment script for each field. That would at least centralize all the equivalent processing sequences into one place. Best (imagined) case would be the list approach.
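For the parts that do not need the geoip filter (tagging by IP class and collecting into related.ip), the list approach might look roughly like the sketch below, written as a script file for the ruby filter. This is a minimal sketch under assumptions: the `ip_fields` script parameter, the default field list, the `ip_class` helper, and the `ip_<class>` tag names are all hypothetical, and the `IPAddr` predicates used need Ruby 2.5 or later. It does not perform any GeoIP lookup.

```ruby
require "ipaddr"

# Called once at startup with the script_params from the pipeline config.
# "ip_fields" is a hypothetical parameter name chosen for this sketch.
def register(params)
  @ip_fields = params["ip_fields"] || ["[source][ip]", "[destination][ip]"]
end

# Classify one address string; returns nil when the value is not an IP.
def ip_class(value)
  return nil unless value.is_a?(String) && !value.empty?
  ip = IPAddr.new(value)
  return "loopback" if ip.loopback?
  return "link_local" if ip.link_local?
  return "private" if ip.private?
  "public"
rescue IPAddr::Error
  nil
end

# Called per event: tag by IP class and collect addresses into [related][ip].
def filter(event)
  related = Array(event.get("[related][ip]"))
  @ip_fields.each do |field|
    # A field may be missing, a single string, or an array of strings.
    Array(event.get(field)).each do |value|
      klass = ip_class(value)
      next if klass.nil?
      event.tag("ip_#{klass}") # hypothetical tag naming
      related << value
    end
  end
  event.set("[related][ip]", related.uniq) unless related.empty?
  [event]
end
```

In the pipeline this would be referenced with the ruby filter's `path` option, passing the field list through `script_params`, so adding a new IP field means editing one array rather than copying a block.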
Still crossing fingers that there is a way, but if it's not obvious for someone like you who's far more versed in these functions, I might be inclined to accept the less elegant but functional approach. Thank you!