Nested querying like SQL

(Tomás Law) #1


I have logs with the fields client_ip, URI and timestamp.

I want to be able to group these logs by sessions. That is, a session corresponds to a group of logs with the same IP and with consecutive and similar timestamps.

The objective is to show all the Uris that each IP accessed ordered by timestamp.

For example
IP URI Timestamp ....T08:30:00 ....T08:31:00 ....T12:45:00 ....T12:47:00

From those logs, I want to be able to group the session with google and yahoo, and the session with Facebook and YouTube.

The end goal is to count how many of each sessions occurred.

Thank you

(Marius Dragomir) #2

Getting to the end goal here might be almost impossible in Kibana, but we can try and do it step by step. I'll try and give some pointers in regards to what aggregations you can use to achieve the steps.
We should probably start with a data table, that seems the most basic form of looking at data.

  1. Do a Terms aggregation by client_ip in order to split them all by IP.
  2. You can do a date histogram as well, where you set the bucket size in seconds/minutes/etc according to what you want to set as session lenght
  3. In order to group them together, you can do a Filter aggregation where you can combine them.
    At the end, the Data Table has a sum field as well where you can see the count.

(Tomás Law) #3

Hello Marius,

Thanks for you reply. Using a date histogram doesn't satisfy the requirements :frowning:

If I group them by pre-defined time intervals, the logs won't be grouped by sessions.
For example, if I group them hourly:

LogId Timestamp
1 ...T08:58:32
2 ...T09:01:15

These 2 logs will be grouped into different sessions even though they belong to the same one :frowning:

The end goal is to show, for each IP address, all the LogId's separated by session.
I have this:


Using elasticsearch, would it be possible to somehow create an index from the existing one?
This way, I could:
. Group all the events by ClientIp
. Order by Timestamp
. Sequentially, go through each log and store the timestamp of each log as a field SessionEnd of the aggregation
. Therefore, I could compare the timestamp of each log to (SessionEnd+30minutes) to check whether I should group that log with the existing session or create a new session.


(system) #4

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.