Computing session durations from timestamps scattered over two documents

Background

Clients connect to my Websocket-Server. I have logs for both join & disconnect events in the following format (logs redacted):

{
  "_index": "filebeat-7.7.0-2020.05.22-000001",
  "_type": "_doc",
  "_id": "5CAqS3IBb2WkqqHd1BPf",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2020-05-25T09:29:48.389Z",
    "message_decoded": {
      "data": {
        "uuid": "b45739b9-8323-43e5-b8f1-50abb67962e9"
      },
      "type": "join"
  },
    "fields": {
      "@timestamp": [
        "2020-05-25T09:29:48.389Z"
    ]
  },
}

The disconnect-log looks similar. Importantly, I have the @timestamp and uuid fields.

What I'm trying to do

I would like to compute the average session duration for my clients. Filtering for any uuid yields two documents (uuid is session unique), the difference between the two @timestamps is what I am trying to compute.

I want to plot session durations in a line chart / vertical bar chart. I also want to compute the average session duration.

What I have come up with so far

I started by bucketing uuids using the Terms-aggregator. This creates buckets of size 2.

Next, I need to somehow reduce each bucket into an average session duration. However, I cannot seem to find a metric that does this in the Kibana interface. I have looked into the following options:

  1. Use a painless-script to compute the date difference. However, I am unsure how to use a script to compute over an entire bucket. The examples I find compute run a script per document. I have seen that I can include scripts as custom JSON inside Kibana, but am still unsure as to what that script would look like.
  2. Use Bucket Script Aggregation. However it says in the documentation:

The specified metric must be numeric and the script must return a numeric value.

My question

What is the easiest way to compute session durations, as outlined above? Could anybody point me in the right direction here?

Hi,

have a look at data transforms. Your direction towards using painless is right, with transform you can create sessions out of events and you use painless to extract. e.g. session duration.

A similar example is discussed in this webinar. I think you should be able to take the example code provided and customize it to your needs.

(The webinar might be a bit old, lots of new functionality has been added meanwhile, therefore take a look into the more recent docs, too)

Hope that helps!

1 Like

Hey Hendrik,

thanks a lot for your answer, you pointed in exactly the right direction! The webinar also proved to be very helpful.

I ended up using a data transform (which I already love!) and some scripting to compute session durations. Now for some reason, the Kibana interface didn't let me specify the script from the "Advanced Pivot Editor" (I might have done something wrong), but using the "Dev tools", creating the continuous data transform worked flawlessly.

For reference, I am pasting the API-call here.

PUT _transform/sessions_continuous
{
  "source": {
    "index": [
      "filebeat-7.7.0-2020.05.22-000001"
    ]
  },
  "pivot": {
  "group_by": {
    "message_decoded.data.uuid": {
      "terms": {
        "field": "message_decoded.data.uuid"
      }
    }
  },
  "aggregations": {
    "@timestamp_max": {
      "max": {
        "field": "@timestamp"
      }
    },
    "@timestamp_min": {
      "min": {
        "field": "@timestamp"
      }
    },
    "duration": {
      "bucket_script": {
        "buckets_path": {
          "min": "@timestamp_min",
          "max": "@timestamp_max"
        },
        "script": "(params.max - params.min)/1000"
      }
    }
  }
},
"description": "Session durations",
  "dest": {
    "index": "continuous_sessions"
  },
  "frequency": "10s",
  "sync": {
	"time": {
		"field": "@timestamp",
		"delay": "10s"
	}
}
}

Again, thanks a lot for your help!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.