I am writing a custom filter plugin based loosely on the Throttle.rb filter. The filter needs to deduplicate netflow data by collecting all events that belong to the same flow and, once a time window has passed, producing a new event that aggregates them. For instance, if 3 events from the same flow arrive within 5 seconds, the only event that should pass through the filter (i.e. the only event that filter_matched(event) gets called on) is a new aggregate event containing the averaged data of those 3 original events.
The problem is that there is no way to know whether a particular event is the last one in its time window until the window has passed. By then that event is gone: it either had its tags and fields added or it did not. The workaround I'm considering is to create a new event and call something like add_tag("deduplicated") manually in the Ruby file for my custom filter, much like the filter base class does internally. Is this the correct way to handle the situation, or am I missing something?
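To make the idea concrete, here is a rough plain-Ruby sketch of the buffering and aggregation I have in mind. It deliberately does not use the plugin API: plain hashes stand in for events, and FlowAggregator, the field names ("src", "dst", "bytes", "event_count"), and the explicit timestamps are all made up for illustration. How this should hook into the filter lifecycle is the part I'm unsure about.

```ruby
# Plain-Ruby sketch only; hashes stand in for events.
# FlowAggregator and all field names are hypothetical.
class FlowAggregator
  def initialize(window_seconds:, key_fields:, avg_fields:)
    @window = window_seconds
    @key_fields = key_fields
    @avg_fields = avg_fields
    @buckets = {} # flow key => { started_at:, events: [] }
  end

  # Buffer the event under its flow key; nothing is emitted here.
  def add(event, now)
    key = @key_fields.map { |f| event[f] }
    bucket = (@buckets[key] ||= { started_at: now, events: [] })
    bucket[:events] << event
    nil
  end

  # Meant to be called periodically; returns one aggregate per expired window.
  def flush(now)
    expired = @buckets.select { |_, b| now - b[:started_at] >= @window }
    expired.map do |key, bucket|
      @buckets.delete(key)
      build_aggregate(key, bucket[:events])
    end
  end

  private

  def build_aggregate(key, events)
    aggregate = @key_fields.zip(key).to_h
    @avg_fields.each do |f|
      values = events.map { |e| e[f] }.compact
      aggregate[f] = values.sum.to_f / values.size unless values.empty?
    end
    aggregate["event_count"] = events.size
    # Tag applied by hand, because the original events never matched.
    aggregate["tags"] = ["deduplicated"]
    aggregate
  end
end

agg = FlowAggregator.new(window_seconds: 5,
                         key_fields: ["src", "dst"],
                         avg_fields: ["bytes"])
flow = { "src" => "10.0.0.1", "dst" => "10.0.0.2" }
agg.add(flow.merge("bytes" => 100), 0)
agg.add(flow.merge("bytes" => 200), 2)
agg.add(flow.merge("bytes" => 300), 4)

early  = agg.flush(4) # window still open, nothing emitted
result = agg.flush(5) # window expired, one aggregate emitted
```

In this sketch the three buffered events collapse into a single hash whose "bytes" field is their average and whose "tags" field is set manually, which is the behaviour I would want from the new aggregate event.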
Let me know if you need more details. Thanks in advance!