New guy needs help visualizing lab active usage

So, I'm new to Kibana, and data visualization in general, but we need to be able to track and visualize usage/utilization in our computer labs.

I have records being collected that indicate, for each computer, the session ID, user, action (start/stop), and time.

So it looks something like:

Time                                    Action  ComputerName    SessionID       Username
February 28th 2020, 08:02:51.000        added   computer-09     46613           user1
February 27th 2020, 17:04:59.000        removed computer-09     53706           user1
February 27th 2020, 14:35:56.000        added   computer-01     52879           user2
February 27th 2020, 13:20:16.000        removed computer-01     124             user2
February 27th 2020, 12:39:24.000        added   computer-01     124             user2
February 27th 2020, 12:38:51.000        added   computer-09     53706           user1
February 27th 2020, 12:33:44.000        added   computer-07     189             user3

But I'm having trouble wrapping my head around how I get those to be used together, and how I would use them to do something like a line graph for "Average Utilization Per Day".


Any help is appreciated. Even just basic concepts, direction, or resource recommendations!

Elasticsearch has some limitations that keep it fast, which we reflect in Kibana. You are running into one of the most common: doing time math across multiple documents. It's a common use case, and our solution to the problem is continuous transforms.
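Conceptually, the transform pivots the raw events so each session's "added" and "removed" records end up in one document with a computed duration. A rough, standalone Python sketch of that pivot (field names borrowed from the sample table above, not from any actual mapping) might look like:

```python
from datetime import datetime

# Raw events as in the sample above: one row per start ("added") or stop ("removed").
events = [
    {"time": "2020-02-27 12:39:24", "action": "added",   "computer": "computer-01", "session": 124,   "user": "user2"},
    {"time": "2020-02-27 13:20:16", "action": "removed", "computer": "computer-01", "session": 124,   "user": "user2"},
    {"time": "2020-02-27 12:38:51", "action": "added",   "computer": "computer-09", "session": 53706, "user": "user1"},
    {"time": "2020-02-27 17:04:59", "action": "removed", "computer": "computer-09", "session": 53706, "user": "user1"},
]

def pivot_sessions(events):
    """Group events by (computer, session id) and pair added/removed into one record."""
    sessions = {}
    for e in events:
        key = (e["computer"], e["session"])
        rec = sessions.setdefault(key, {"computer": e["computer"],
                                        "session": e["session"],
                                        "user": e["user"]})
        ts = datetime.strptime(e["time"], "%Y-%m-%d %H:%M:%S")
        if e["action"] == "added":
            rec["start"] = ts
        else:
            rec["stop"] = ts
    # Keep only completed sessions (both start and stop seen) and compute duration.
    out = []
    for rec in sessions.values():
        if "start" in rec and "stop" in rec:
            rec["duration_s"] = (rec["stop"] - rec["start"]).total_seconds()
            out.append(rec)
    return out

for s in pivot_sessions(events):
    print(s["computer"], s["session"], s["duration_s"])
```

This is roughly what a transform with a `group_by` on the session and min/max aggregations on the timestamp produces, just done client-side for illustration.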

Ok, so I've made progress based on your direction. I now have a transform index, with a scripted field, that is populating with data like this:

Time                        computer_name     username pid    timestamp_start             timestamp_stop              duration
Mar 5, 2020 @ 13:10:58.000  dept1-bldg123-01  user1    35552  Mar 5, 2020 @ 13:10:58.000  Mar 5, 2020 @ 13:10:58.000  a few seconds
Mar 5, 2020 @ 12:11:06.000  dept1-bldg123-09  user3    20639  Mar 5, 2020 @ 12:11:06.000  Mar 5, 2020 @ 12:11:06.000  a few seconds
Mar 4, 2020 @ 23:58:52.000  dept2-bldg456-02  user5    19977  Mar 4, 2020 @ 23:58:52.000  Mar 5, 2020 @ 00:04:25.000  6 minutes
Mar 4, 2020 @ 23:47:43.000  dept2-bldg456-02  user6    10869  Mar 4, 2020 @ 23:47:43.000  Mar 4, 2020 @ 23:53:18.000  6 minutes
Mar 4, 2020 @ 23:25:06.000  dept2-bldg456-03  user9    5376   Mar 4, 2020 @ 23:25:06.000  Mar 5, 2020 @ 03:25:06.000  4 hours
Mar 4, 2020 @ 23:19:31.000  dept2-bldg456-02  user2    97053  Mar 4, 2020 @ 23:19:31.000  Mar 4, 2020 @ 23:25:06.000  6 minutes
Mar 4, 2020 @ 22:51:29.000  dept2-bldg456-02  user8    81131  Mar 4, 2020 @ 22:51:29.000  Mar 4, 2020 @ 23:02:37.000  11 minutes
Mar 4, 2020 @ 14:51:52.000  dept2-bldg456-06  user7    2296   Mar 4, 2020 @ 14:51:52.000  Mar 4, 2020 @ 16:29:07.000  2 hours
Mar 4, 2020 @ 00:07:29.000  dept2-bldg456-06  user10   45844  Mar 4, 2020 @ 00:07:29.000  Mar 4, 2020 @ 00:23:01.000  16 minutes

This is real (sanitized) data from my test pool.

So now, I am hitting my next stumbling block: figuring out how to visualize this so that hours are allocated to the appropriate days when a session crosses midnight (spans multiple days). For example, if I want to show overall usage per day in hours, a session like the one for 'user9', which starts at 11:25 PM and continues for 4 hours, causes trouble. How would I get it to split the duration, counting the first ~35 minutes on Mar 4 and the remaining 3 hours 25 minutes on Mar 5?

So far, I've only been able to get it to apply the entire duration to either one day or the other.

I've been looking into this a bit, and I think you might be running into a calculation that would be difficult to do in any kind of visualization tool. For example, I tried building this as a Vega visualization, but was unable to calculate daily totals; the best I could produce was a bar chart of the raw session spans per machine.

Basically, the analysis you are looking for involves too many steps of post-processing to be easily done within the tools. I suggest that you do this processing in a script outside of Elasticsearch.
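To illustrate the kind of post-processing I mean, here is a minimal Python sketch (nothing built into Kibana or Elasticsearch) that splits a session at midnight boundaries and attributes the hours to each calendar day, using the 'user9' session from the table above:

```python
from datetime import datetime, timedelta

def split_by_day(start, stop):
    """Yield (date, hours) pairs, splitting the [start, stop) interval at midnight."""
    cursor = start
    while cursor < stop:
        # The midnight that follows the current cursor position.
        next_midnight = datetime.combine(cursor.date() + timedelta(days=1),
                                         datetime.min.time())
        segment_end = min(stop, next_midnight)
        yield cursor.date(), (segment_end - cursor).total_seconds() / 3600.0
        cursor = segment_end

# The 'user9' session: Mar 4 23:25:06 -> Mar 5 03:25:06 (4 hours total).
start = datetime(2020, 3, 4, 23, 25, 6)
stop = datetime(2020, 3, 5, 3, 25, 6)
for day, hours in split_by_day(start, stop):
    print(day, round(hours, 2))
```

Summing the per-day pieces across all sessions (e.g. in a dict keyed by date) gives the "usage per day in hours" series, which could then be indexed back into Elasticsearch for charting.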

Here is the Vega config I used, if you're interested in trying this out:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
  "width": "container",
  "height": "container",
  "data": {
    "values": [
{ "dept": "dept1-bldg123-01", "user": "user1",  "pid": 35552, "start": "03/05/2020 13:10:58.000", "end": "03/05/2020 13:10:58.000" },
{ "dept": "dept1-bldg123-09", "user": "user3",  "pid": 20639, "start": "03/05/2020 12:11:06.000", "end": "03/05/2020 12:11:06.000" },
{ "dept": "dept2-bldg456-02", "user": "user5",  "pid": 19977, "start": "03/04/2020 23:58:52.000", "end": "03/05/2020 00:04:25.000" },
{ "dept": "dept2-bldg456-02", "user": "user6",  "pid": 10869, "start": "03/04/2020 23:47:43.000", "end": "03/04/2020 23:53:18.000" },
{ "dept": "dept2-bldg456-03", "user": "user9",  "pid": 5376,  "start": "03/04/2020 23:25:06.000", "end": "03/05/2020 03:25:06.000" },
{ "dept": "dept2-bldg456-02", "user": "user2",  "pid": 97053, "start": "03/04/2020 23:19:31.000", "end": "03/04/2020 23:25:06.000" },
{ "dept": "dept2-bldg456-02", "user": "user8",  "pid": 81131, "start": "03/04/2020 22:51:29.000", "end": "03/04/2020 23:02:37.000" },
{ "dept": "dept2-bldg456-06", "user": "user7",  "pid": 2296,  "start": "03/04/2020 14:51:52.000", "end": "03/04/2020 16:29:07.000" },
{ "dept": "dept2-bldg456-06", "user": "user10", "pid": 45844, "start": "03/04/2020 00:07:29.000", "end": "03/04/2020 00:23:01.000" }
  ] },
  "transform": [{
    "calculate": "utcParse(datum.start, '%m/%d/%Y %H:%M:%S.%L')",
    "as": "starttime"
  }, {
    "calculate": "utcParse(datum.end, '%m/%d/%Y %H:%M:%S.%L')",
    "as": "endtime"
  }],
  "mark": "bar",
  "encoding": {
    "y": { "field": "dept", "type": "nominal" },
    "x": { "field": "starttime", "type": "temporal", "scale": { "padding": 20 }, "stack": true },
    "x2": { "field": "endtime" }
  }
}

@wylie thanks much for looking into it, and for your advice. I’ll look into possibly splitting it up via a script of some kind.

There is one other possibility that is worth mentioning, but I'm not sure it helps you in the short term. Elasticsearch offers a date_range data type which supports start/end pairs like you have. It's not supported in Kibana, but is usable through Vega. Because the date histogram aggregation can work on ranges, you may be able to get a bar chart showing total usage per day by using this datatype.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.