I have logs with the fields:
Each log shows the time (timestamp) the URI was visited. The SessionId is a field I created by concatenating IP with Timestamp (without the minutes or seconds). That is, all the logs belonging to the same day and hour are grouped into the same SessionId.
IP Timestamp URI SessionId
18.104.22.168 December25th2017T08:30:00 google.com 22.214.171.124_25-12-2017_8
126.96.36.199 December25th2017T08:45:00 youtube.com 188.8.131.52_25-12-2017_8
184.108.40.206 December25th2017T09:30:00 facebook.com 220.127.116.11_25-12-2017_9
18.104.22.168 December25th2017T09:58:00 youtube.com 22.214.171.124_25-12-2017_9
126.96.36.199 December30th2017T15:20:00 yahoo.com 188.8.131.52_30-12-2017_15
184.108.40.206 December30th2017T15:30:00 elastic.com 220.127.116.11_30-12-2017_15
18.104.22.168 December30th2017T15:58:00 youtube.com 22.214.171.124_30-12-2017_15
126.96.36.199 December30th2017T09.35:00 bing.com 188.8.131.52_30-12-2017_9
184.108.40.206 December30th2017T09.59:00 google.com 220.127.116.11_30-12-2017_9
Now, I need to know the top URI that were the last ones to be visited in the sessions.
That is, for the example above:
-> "youtube.com" was the last URI visited in the following sessions:
-> "google.com" was the last URI visited in the following session:
So far, I've managed to get the last visited URL's for each session by doing a terms aggregation on session_id and then a top_hits aggregation on timestamp.
The problem is that I can't group them by URL (to compute the count for each URL) because top_hits aggregation doesn't allow any subaggregations...
How can I solve this?