I have some logs in my Elasticsearch database formatted like this:
Timestamp | ID | Command
------------------------------------+-------+---------------------------
November 24th 2017, 14:49:00.11614 | 0 | CONNECTION_REQUEST
November 24th 2017, 14:49:00.13510 | 1 | CONNECTION_REQUEST
November 24th 2017, 14:49:00.18714 | 0 | CONNECTION_COMPLETE
November 24th 2017, 14:49:00.26010 | 1 | CONNECTION_COMPLETE
November 24th 2017, 14:50:20.7850 | 2 | CONNECTION_REQUEST
November 24th 2017, 14:50:20.8051 | 2 | CONNECTION_REQUEST
November 24th 2017, 14:50:20.8450 | 2 | CONNECTION_COMPLETE
There are CONNECTION_REQUEST and CONNECTION_COMPLETE messages, both of which specify an ID. A CONNECTION_REQUEST can be retransmitted before the CONNECTION_COMPLETE happens (as with ID 2 above). There are also logs in between with other commands, but they are not considered here.
What I want to calculate is the time between the first CONNECTION_REQUEST and the CONNECTION_COMPLETE for each ID, e.g.:
Time_0 = November 24th 2017, 14:49:00.18714 - November 24th 2017, 14:49:00.11614 = 7100ms
Time_1 = November 24th 2017, 14:49:00.26010 - November 24th 2017, 14:49:00.13510 = 12500ms
Time_2 = November 24th 2017, 14:50:20.8450 - November 24th 2017, 14:50:20.7850 = 600ms
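To make the intended calculation concrete, here is a minimal sketch in plain Python of the logic I am after (this is not an Elasticsearch query). The tuple layout, the function name `connection_times`, and the integer timestamps are made up for illustration: the timestamps are reduced to plain numbers in the same units as the differences above, and the offset used for the 14:50:20 entries is arbitrary.

```python
def connection_times(logs):
    """Return {id: complete_ts - first_request_ts} for every ID that completed.

    `logs` is an iterable of (timestamp, id, command) tuples (hypothetical layout).
    """
    first_request = {}  # id -> timestamp of the earliest pending request
    durations = {}      # id -> time from first request to completion

    for ts, conn_id, command in sorted(logs):  # process in timestamp order
        if command == "CONNECTION_REQUEST":
            # Keep only the earliest request; retransmissions are ignored.
            first_request.setdefault(conn_id, ts)
        elif command == "CONNECTION_COMPLETE" and conn_id in first_request:
            # Pop so that a later reuse of the same ID starts fresh.
            durations[conn_id] = ts - first_request.pop(conn_id)
        # Any other command is skipped.

    return durations


# Sample data mirroring the table above (timestamps simplified, see note).
logs = [
    (11614, 0, "CONNECTION_REQUEST"),
    (13510, 1, "CONNECTION_REQUEST"),
    (18714, 0, "CONNECTION_COMPLETE"),
    (26010, 1, "CONNECTION_COMPLETE"),
    (107850, 2, "CONNECTION_REQUEST"),
    (108051, 2, "CONNECTION_REQUEST"),  # retransmission, must be ignored
    (108450, 2, "CONNECTION_COMPLETE"),
]

print(connection_times(logs))
```

The retransmitted request for ID 2 does not affect the result, because only the first request per ID is kept. I am looking for the equivalent of this done inside Elasticsearch with aggregations.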
I can make a bucket of the CONNECTION_COMPLETE logs, but how do I then get the first occurrence of CONNECTION_REQUEST with the same ID before the timestamp of the CONNECTION_COMPLETE? I don't really know which aggregation(s) to use or how to make them interact with each other.