Sorry I was not clear before; let me explain the scenario in detail. Suppose I have a back-end application that generates 100 log records per second. The logs combine system logs and application logs in one generic format:
{"log_time":"01/01/2018 00:00:00.000","user_id":111,"session_id":17878,"event":"CREATED","action":"","job_id":1,"msg":"create new job ===> id = 1"}
{"log_time":"01/01/2018 01:00:00.000","user_id":569,"session_id":54578,"event":"RANDOM","action":"","job_id":0,"msg":""}
{"log_time":"01/01/2018 02:00:00.000","user_id":569,"session_id":54578,"event":"RANDOM","action":"","job_id":0,"msg":""}
... RANDOM LOG
... RANDOM LOG
... RANDOM LOG
... RANDOM LOG
... RANDOM LOG
{"log_time":"30/01/2018 02:00:00.000","user_id":979,"session_id":78899,"event":"RANDOM","action":"","job_id":0,"msg":""}
{"log_time":"31/01/2018 00:00:00.000","user_id":111,"session_id":87896,"event":"CLOSED","action":"","job_id":1,"msg":"job id = 1 completed"}
In the logs above, the user with user_id = 111 CREATED a new job/ticket/record (job_id=1) in the application on 01/01/2018, and on 31/01/2018 that user CLOSED the same job/ticket/record (job_id=1).
That means 100 * 60 * 60 * 24 = 8,640,000 records are generated per day, and I want information about individual jobs on the fly, e.g. for job_id=1:
- number of jobs/tickets/records CREATED/CLOSED in 1 day/month/year
- elapsed time between two events (job_id=1, event=CREATED and event=CLOSED)
- average time spent on a job/ticket/record by one person (user_id = 111)
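To make the three requirements concrete, here is a minimal Python sketch of the calculation I have in mind, run offline over the sample logs. It is not a Logstash solution. The function name `summarize` and the in-memory dictionaries are hypothetical, the time format is assumed to be day/month/year as in the sample, and I am assuming the time spent on a job is credited to the user who CLOSED it.

```python
import json
from collections import Counter, defaultdict
from datetime import datetime

# Assumed format: day/month/year with milliseconds, as in the sample logs.
LOG_TIME_FORMAT = "%d/%m/%Y %H:%M:%S.%f"


def summarize(lines):
    """Hypothetical helper: derive the three metrics from JSON log lines."""
    created = {}                  # job_id -> time of the CREATED event
    elapsed = {}                  # job_id -> timedelta between CREATED and CLOSED
    per_day = Counter()           # (ISO date, event) -> number of events
    per_user = defaultdict(list)  # user_id -> elapsed seconds of jobs they closed

    for line in lines:
        rec = json.loads(line)
        event = rec["event"]
        if event not in ("CREATED", "CLOSED"):
            continue  # skip the RANDOM logs
        ts = datetime.strptime(rec["log_time"], LOG_TIME_FORMAT)
        per_day[(ts.date().isoformat(), event)] += 1
        job_id = rec["job_id"]
        if event == "CREATED":
            created[job_id] = ts
        elif job_id in created:
            delta = ts - created.pop(job_id)
            elapsed[job_id] = delta
            per_user[rec["user_id"]].append(delta.total_seconds())

    # Average seconds per closed job for each user.
    avg_per_user = {u: sum(v) / len(v) for u, v in per_user.items()}
    return per_day, elapsed, avg_per_user


sample = [
    '{"log_time":"01/01/2018 00:00:00.000","user_id":111,"session_id":17878,"event":"CREATED","action":"","job_id":1,"msg":"create new job ===> id = 1"}',
    '{"log_time":"01/01/2018 01:00:00.000","user_id":569,"session_id":54578,"event":"RANDOM","action":"","job_id":0,"msg":""}',
    '{"log_time":"31/01/2018 00:00:00.000","user_id":111,"session_id":87896,"event":"CLOSED","action":"","job_id":1,"msg":"job id = 1 completed"}',
]
per_day, elapsed, avg_per_user = summarize(sample)
print(elapsed[1], per_day, avg_per_user)
```

Keeping the CREATED time in memory until the matching CLOSED event arrives is the part I do not know how to express in Logstash, especially when the two events are a month apart.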
Is it possible to implement this in Logstash, or is there another way to achieve it?
@Badger, if you still think this is possible using Logstash aggregation example 1, could you explain it in a bit more detail? I have not been able to figure it out.
I have another solution that covers all of the above scenarios, but it requires changing the back-end application. When I receive the CLOSED event, I will add the CREATED event time (start_time) to the same log, and then I only need to add one scripted field in Kibana, as follows.
log: -
{"log_time":"31/01/2018 00:00:00.000","user_id":111,"session_id":87896,"event":"CLOSED","action":"","job_id":1,"msg":"job id = 1 completed", "start_time":"01/01/2018 00:00:00.000"}
script_field: -
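// Elapsed time in whole minutes between start_time and log_time; null when start_time is missing.
if (doc['start_time'].size() != 0) {
    return (doc['log_time'].value.millis - doc['start_time'].value.millis) / 1000 / 60;
} else {
    return null;
}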
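The back-end change described above (copying the CREATED time into the CLOSED log as start_time) could look roughly like the following Python sketch. The function `add_start_time` and the `start_times` dictionary are hypothetical; in the real application the CREATED time would more likely be read from the job record in the database than kept in memory.

```python
def add_start_time(record, start_times):
    """Hypothetical helper: attach start_time to a CLOSED log record.

    start_times maps job_id -> log_time of the CREATED event (an assumed
    in-memory store used only for illustration).
    """
    job_id = record["job_id"]
    if record["event"] == "CREATED":
        # Remember when the job was created.
        start_times[job_id] = record["log_time"]
    elif record["event"] == "CLOSED" and job_id in start_times:
        # Return a copy of the CLOSED record with the CREATED time added.
        record = dict(record, start_time=start_times.pop(job_id))
    return record


start_times = {}
created_log = add_start_time(
    {"log_time": "01/01/2018 00:00:00.000", "user_id": 111, "event": "CREATED", "job_id": 1},
    start_times,
)
closed_log = add_start_time(
    {"log_time": "31/01/2018 00:00:00.000", "user_id": 111, "event": "CLOSED", "job_id": 1},
    start_times,
)
print(closed_log)
```

With start_time present in the CLOSED document, the scripted field only has to subtract two fields of the same document, so no correlation between separate log lines is needed at query time.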
In the image below, I am calculating the elapsed time in minutes between the LOGGEDIN and LOGGEDOUT events using the scripted field feature in Kibana (time_span column).