What is the best solution for one of the basic requirements in log analysis: calculating duration?
The problem is to calculate the duration between two events. For example:
Log lines:
15-04-2016T10:00:00:000+UTC 1 start data
15-04-2016T10:00:00:001+UTC 2 start data
15-04-2016T10:00:00:004+UTC 2 end data
15-04-2016T10:00:00:005+UTC 1 end data
In that case, a field called "Duration" must be added to the "end" events and assigned the end time minus the start time of the event with the same ID (in milliseconds):
15-04-2016T10:00:00:000+UTC 1 start data 0
15-04-2016T10:00:00:001+UTC 2 start data 0
15-04-2016T10:00:00:004+UTC 2 end data 3
15-04-2016T10:00:00:005+UTC 1 end data 5
That is the solution if we want to calculate the duration from the timestamp. The problem is that the timestamp is assigned by Logstash when an event arrives, so this does not work unless processing happens in real time. For that reason, suppose a scenario like this:
Logstash Processing Time ..... ID ........ TAG ........ Real time for the event
If we used the Elapsed plugin, the duration would be 5 milliseconds, but the real duration should be calculated from the Real Time field; in that case the duration would be 20 milliseconds.
@Rubytor You can overwrite the Logstash Processing Time with the real time for the event. Use the date filter for this purpose:
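A minimal sketch of such a date filter, assuming an earlier grok or dissect filter has already copied the leading timestamp of each line into a field named log_time (a hypothetical name, not taken from the original post):

```
filter {
  date {
    # Joda-Time pattern for values such as 15-04-2016T10:00:00:000+UTC.
    # The trailing "+UTC" is matched as literal text, so the zone is set explicitly.
    match => ["log_time", "dd-MM-yyyy'T'HH:mm:ss:SSS'+UTC'"]
    timezone => "UTC"
    # "target" defaults to @timestamp, so the parsed value replaces
    # the time at which Logstash processed the event.
  }
}
```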
@Gnosis When you use the aggregate-filter you must set filter workers to 1. This isn't really nice.
You should be very careful to set logstash filter workers to 1 (-w 1 flag) for this filter to work correctly otherwise documents may be processed out of sequence and unexpected results will occur.
Maybe your shipper can do this. It depends on your use case and infrastructure. Imagine you have two Logstash instances with a load balancer in front of them. How would you ensure that all events flow to the same Logstash instance?
In my opinion, the right solution for your need is to use the 'date' filter and then the 'elapsed' filter.
The date filter lets you put your message date (e.g. 15-04-2016T10:00:00:000+UTC) into the @timestamp field.
The elapsed filter then computes the elapsed time between the start event and the end event (using the @timestamp field) and stores the duration in the 'elapsed.time' field of the end event.
But you have to know one thing: the computed duration is in seconds. If you want a more precise duration (in milliseconds, for example), you will have to use the aggregate filter.
In all cases, you must first use the 'date' filter to set the message date in the @timestamp field.
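The date-then-elapsed approach could be sketched roughly as follows. The field names (log_time, id, action, payload) and the tag names (task_start, task_end) are assumptions made for illustration, and the dissect mapping assumes lines shaped exactly like the sample above:

```
filter {
  # Split "15-04-2016T10:00:00:000+UTC 1 start data" into four fields.
  dissect {
    mapping => { "message" => "%{log_time} %{id} %{action} %{payload}" }
  }

  # Put the message date into @timestamp before anything is measured.
  date {
    match => ["log_time", "dd-MM-yyyy'T'HH:mm:ss:SSS'+UTC'"]
    timezone => "UTC"
  }

  # The elapsed filter pairs events by tag, so tag start and end events first.
  if [action] == "start" {
    mutate { add_tag => ["task_start"] }
  } else if [action] == "end" {
    mutate { add_tag => ["task_end"] }
  }

  # Pair start and end events that share the same id; the duration
  # is written onto the end event.
  elapsed {
    start_tag       => "task_start"
    end_tag         => "task_end"
    unique_id_field => "id"
  }
}
```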
Here is the Logstash configuration using the aggregate filter:
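A minimal sketch of such a configuration, not necessarily the one originally posted. It assumes Logstash 5.0 or later (for the event.get / event.set API), a single filter worker (-w 1), and the same hypothetical field names as before (log_time, id, action, payload); the output field name duration is also an assumption:

```
filter {
  dissect {
    mapping => { "message" => "%{log_time} %{id} %{action} %{payload}" }
  }

  # Set @timestamp from the message date first.
  date {
    match => ["log_time", "dd-MM-yyyy'T'HH:mm:ss:SSS'+UTC'"]
    timezone => "UTC"
  }

  if [action] == "start" {
    # Remember the start time, in milliseconds, under this task id.
    aggregate {
      task_id    => "%{id}"
      map_action => "create"
      code       => "map['start_ms'] = (event.get('@timestamp').to_f * 1000).round"
    }
  } else if [action] == "end" {
    # Subtract the stored start time and write the result onto the end event.
    aggregate {
      task_id     => "%{id}"
      map_action  => "update"
      code        => "event.set('duration', (event.get('@timestamp').to_f * 1000).round - map['start_ms'])"
      end_of_task => true
    }
  }
}
```

With map_action set to "update", the end-event code only runs when a start event with the same id has already been seen, which is one reason event ordering matters for this filter.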