What is the best solution for one of the basic requirements in log analysis: "CALCULATE DURATION"?

The problem is to calculate the duration between 2 events. For example:

Logs lines:

15-04-2016T10:00:00:000+UTC 1 start data
15-04-2016T10:00:00:001+UTC 2 start data
15-04-2016T10:00:00:004+UTC 2 end data
15-04-2016T10:00:00:005+UTC 1 end data

In that case, it's necessary to add a field called "Duration" to the "end" events and assign it the difference between the two timestamps.

15-04-2016T10:00:00:000+UTC 1 start data 0
15-04-2016T10:00:00:001+UTC 2 start data 0
15-04-2016T10:00:00:004+UTC 2 end data 3
15-04-2016T10:00:00:005+UTC 1 end data 5

Thank you to all.

Sorry, I'm not sure I understand your question.

Just calculate the time between 2 events.

The events have an ID to correlate them and a flag like start, step1, step2, end.

I want to know the duration from start to step1, from start to step2, and the total duration from start to end.

Thank you

Maybe this is the right filter :


That's the solution if we want to calculate from the timestamp. The problem is that the timestamp is assigned by Logstash when an event arrives. If we don't have real-time processing, this doesn't work. For that reason, suppose a scenario like this:

Logstash Processing Time      ID   TAG     Real time for the event

15-04-2016T10:00:00:000+UTC   1    start   15-04-2016T9:00:30:000+UTC
15-04-2016T10:00:00:005+UTC   1    end     15-04-2016T9:00:30:020+UTC

If we use the elapsed plugin, the duration will be 5 milliseconds, but the real duration should be calculated from the Real Time field; in that case the duration would be 20 milliseconds.

How can we handle that case?


You should use the aggregate filter if you follow this example:


aggregate - Elastic
The aim of this filter is to aggregate information available among several events (typically log lines) belonging to a same task, and finally push aggregated ...

It should be fine to just subtract the times, no?

@Rubytor You can overwrite the Logstash Processing Time with the real time for the event. Use the date filter for this purpose:
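For example (a sketch; the field name `real_time` is an assumption about how your events were parsed, e.g. by grok):

```
filter {
  date {
    # overwrite @timestamp with the event's real time
    # ("real_time" is a hypothetical field holding the parsed real time)
    match => [ "real_time", "dd-MM-yyyy'T'HH:mm:ss:SSSZ" ]
  }
}
```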

@Gnosis When you use the aggregate filter you must set filter workers to 1. This isn't really nice.

You should be very careful to set logstash filter workers to 1 (-w 1 flag) for this filter to work correctly otherwise documents may be processed out of sequence and unexpected results will occur.

Ok, thanks a lot. And how do you aggregate data without the aggregate filter, please?

Maybe your shipper can do this. It depends on your use case and infrastructure. Imagine you have two Logstash instances with a load balancer in front of them. How would you ensure that all events flow to the same Logstash instance?

What about Filebeat? I think it has its own load balancing, doesn't it?

In my opinion, the right solution for your need is to use the 'date' filter and then the 'elapsed' filter.

  • The date filter allows you to put your message date (e.g. 15-04-2016T10:00:00:000+UTC) into the @timestamp field.
  • Then the elapsed filter will compute the elapsed time between the start event and the end event (using the @timestamp field) and will store the duration in the 'elapsed.time' field of the end event.

But you have to know one thing: the computed duration is in seconds. If you want a more precise duration (in milliseconds, for example), you will have to use the aggregate filter.
In all cases, you must first use the 'date' filter to set the message date in the @timestamp field.
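The date + elapsed approach could look roughly like this (a sketch; the field names `_timestamp` and `taskid` and the "start"/"end" tags are assumptions about how your events are parsed and tagged upstream, e.g. by grok or mutate):

```
filter {
  date {
    match => [ "_timestamp", "dd-MM-yyyy'T'HH:mm:ss:SSSZ" ]
  }
  # elapsed correlates a start-tagged event with an end-tagged event
  # sharing the same unique_id_field value
  elapsed {
    start_tag => "start"
    end_tag => "end"
    unique_id_field => "taskid"
  }
}
```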

Here is the Logstash configuration using the aggregate filter:

filter {
  date {
    match => [ "_timestamp", "dd-MM-yyyy'T'HH:mm:ss:SSSZ" ]
  }

  if [start] {
    aggregate {
      task_id => "%{taskid}"
      map_action => "create"
      code => "map['start_timestamp'] = event['@timestamp']"
    }
  }

  if [end] {
    aggregate {
      task_id => "%{taskid}"
      map_action => "update"
      code => "event['duration'] = event['@timestamp'] - map['start_timestamp']"
      end_of_task => true
    }
  }
}
Hope it helps.
