Clarify how timeout_timestamp_field works with timeouts in the aggregate filter

I am aggregating flows using timeouts based on a timestamp field in the events (start_time), with push_map_as_event_on_timeout enabled. Is my understanding of how this works correct?

An event comes in with a new task_id (x), so a map is created for x with creation_time = last_start_time = the event's start_time. When the next event with task_id = x arrives, its start_time is compared to last_start_time and creation_time. If the differences are < inactivity_timeout and < timeout, respectively, the two events are aggregated using "code", last_start_time in the map is updated to the current event's start_time, and processing continues. However, if start_time - last_start_time > inactivity_timeout OR start_time - creation_time > timeout, the map is pushed as an event and a new map is started, with creation_time and last_start_time set to the new event's start_time.

What happens if there are no more events with task_id = x? How/when does the final map for x get flushed?

My reading is that it decides whether to time out the existing map element before it even looks at the current event.

flush() is called every 5 seconds. That also checks for expired maps.

That would make sense if you are not using timeout_timestamp_field (i.e., using clock time), but otherwise it should use the specified field value when deciding whether to expire/time out a map. Right?

Yes, it does that. When it creates a map entry, it stores the offset between the current time and the event's timestamp. Adding that offset back to the time on an event gives a time close to the current time. Every 5 seconds it checks that adjusted value to see whether the delta from the current time is greater than the timeout interval.
