Aggregation filter timeout settings

I'm having some trouble with the logstash aggregate filter.

We are trying to combine sflow events together. These events are not sent to logstash in real time. The sensor data is collected and sent to us in a batch on a daily basis. We run a script on these batches to pull the data into RabbitMQ. This is where logstash begins.

Because sflow is made up of disparate events, there is no definitive start or stop. We are using the fingerprint filter to combine 5 values in the sflow data to mark the aggregation. So, if these five fields match, we want to consider this part of the same event, within a certain timeout.
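For anyone following along, here is a rough Python sketch of the idea of deriving one aggregation key from five fields. It is only an illustration: the field names are hypothetical (the post does not say which five sflow fields are used), and the hashing shown is not claimed to match what the fingerprint filter does internally.

```python
import hashlib

# Hypothetical field names, chosen only for illustration.
KEY_FIELDS = ("src_ip", "dst_ip", "src_port", "dst_port", "protocol")


def task_id(event: dict) -> str:
    """Return a stable hex digest built from the five key fields of an event."""
    # Join with a separator that is unlikely to occur in the values, so that
    # ("ab", "c") and ("a", "bc") do not produce the same key.
    joined = "\x1f".join(str(event[name]) for name in KEY_FIELDS)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

Two events that agree on all five fields get the same key regardless of their other fields, so they would land in the same aggregation map.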

The relevant fields to support this seem to be timeout, inactivity_timeout, and timeout_timestamp_field. We have a field that marks the start of an sflow event, which we use for timeout_timestamp_field. It is converted from a unix timestamp to a logstash date.

We set the inactivity_timeout to 330 (5.5 minutes) and the timeout to 86400 (1 day).

So, if I have 5 events pulled into logstash, and the start time of each event is less than 5.5 minutes after the start time of the previous event, I think they should all be aggregated into a single map.

However, what seems to be happening is that it marks the start time of the first event, and once 5.5 minutes have passed from that time, it triggers the inactivity_timeout and completes the map.

Am I misunderstanding the inactivity_timeout? Is it not supposed to behave like a rolling 5.5 minute window until we don't see another event in that time period, or until the larger 1 day timeout happens?
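To make the two interpretations concrete, here is a small Python sketch that groups event start times (in seconds) both ways. It is a model of the question, not the plugin's code: the function names are made up, and treating a gap exactly equal to the limit as "still inside the window" is my own assumption.

```python
def group_rolling(times, inactivity, timeout):
    """Rolling window: a group stays open while each new event arrives within
    `inactivity` seconds of the previous one and within `timeout` seconds of
    the first one."""
    groups = []
    for t in sorted(times):
        if groups:
            current = groups[-1]
            within_inactivity = t - current[-1] <= inactivity
            within_timeout = t - current[0] <= timeout
            if within_inactivity and within_timeout:
                current.append(t)
                continue
        groups.append([t])
    return groups


def group_fixed(times, window):
    """Fixed window: a group closes `window` seconds after its first event,
    no matter how recently the last event arrived."""
    groups = []
    for t in sorted(times):
        if groups and t - groups[-1][0] <= window:
            groups[-1].append(t)
        else:
            groups.append([t])
    return groups
```

With five events 300 seconds apart and a 330 second limit, group_rolling returns a single group of five, which is what I expected from inactivity_timeout. group_fixed splits the same events into three groups, which is closer to what I seem to be observing.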

Also, since these are logs from times past, how does the timeout work once all the events for the day have been processed? It seems to work on a real-time inactivity_timeout value, which would mean inactivity_timeout actually has two meanings: real time and event time.

Thanks.

If it matters, OS is CentOS 7 (7.6 1810), Java is OpenJDK 8, Logstash is 7.3.0.

Does timeout_timestamp_field help? That controls the "reference" time used for both timeout and inactivity_timeout.

I thought it wasn't working yesterday, but it seems to be working, so maybe I was reading the times wrong.

I'll post again if I have other questions.

Thanks.

I'm still having trouble understanding when logstash will time out an aggregation when timeout_timestamp_field is set.

If logstash has parsed the entire log file, and I don't give it any more log files, but it is still waiting on a timeout for an aggregation, what is that timeout?

My log files cover the course of a single day.
The timeout value is 86400 (1 day).
The inactivity_timeout is set to 330 (5.5 minutes).

I thought all the in-memory logstash aggregations would time out 5.5 minutes after the last event was read. But I'm still watching it push aggregation maps as events after an hour. So, what is the timeout when there are no new events to consider?

I've been looking over the logstash ruby code. It appears that when the aggregate filter is flushed, the remove_expired_maps() function could potentially be called. This is what is triggering the removal of maps when no new events are coming in.

I was looking at this line.

It appears to be saying that we would wait either for the timeout period to pass, or for the difference in time between the last event and the first event to be less than inactivity_timeout seconds before now.

So the maps get held longer the more events are aggregated, even when it is all done processing events.
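Here is how I would put that reading into numbers, as a short Python sketch. This is only a model of my interpretation, not code taken from the plugin, and the function names are hypothetical. It assumes the map is held, in wall-clock seconds after it is created, for the event-time span between its first and last event plus inactivity_timeout, capped by timeout.

```python
def hold_seconds(first_event_ts, last_event_ts, inactivity_timeout, timeout):
    """Wall-clock seconds a map is kept after creation, under the reading
    described in this post (an assumption, not verified plugin behavior)."""
    # Event-time distance between the first and last aggregated event.
    event_span = last_event_ts - first_event_ts
    # The map goes away at whichever limit is reached first.
    return min(timeout, event_span + inactivity_timeout)


def hold_seconds_creation_only(inactivity_timeout, timeout):
    """Same model if only the creation time is used instead of the last
    event timestamp: the event span no longer matters."""
    return min(timeout, inactivity_timeout)
```

Under this model a map with a single event is held for 330 seconds, while a map whose events span one hour of event time is held for 3930 seconds. That would be consistent with still seeing maps pushed more than an hour after the last event was read.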

Am I reading this correctly? Is this intended behavior?

I changed it to use just the creation time rather than the last event timestamp, and this makes the behavior work the way I originally expected. The timeout happens inactivity_timeout seconds after now. But this would obviously be wrong for events coming in real time. I'm just not sure if it would also be wrong for events coming from parsed logs alone. Could you still be aggregating log events after the inactivity timeout has passed in real time, so that the map is removed before it should be?
