Visualize counts on nested buckets


#1

I want to get the counts of nested groups which satisfy a certain condition.

Consider the following type of documents:
{
user-id: "abc"
session-id: "abc-123"
action: "A123"
}

I want to show the count of a specific "action" on a timeline if there a given "user-id" that has at least 2 distinct "session-id"s. The purpose is to tell, how many returning user ( session-ids >= 2 ) triggered a specific action ( action = "A123" ) on Mon., Thu., Wed...

In elastic terms I probably try to do the following:

  1. create a bucket for each "user-id"
  2. consider only buckets having at least 2 distinct "session-ids" vs. consider only buckets having exactly 1 distinct "session-ids".
  3. consider only buckets having a least one document matching (action: "A123")
  4. the aggregate count of remaining documents in a time-histogram

I read through elastic aggregations and pipelines but did not come to a solution in Kibana 6.2 yet. Yould you please point me in a direction? Any hints and ideas are highly appreciated.


#2

Here is what I currently tried:

DELETE usagestats

PUT usagestats/test/1
{
  "user-id": "abc",
  "session-id": "abc-1",
  "action": "A123"
}

PUT usagestats/test/2
{
  "user-id": "abc",
  "session-id": "abc-2",
  "action": "A123"
}

PUT usagestats/test/3
{
  "user-id": "xyz",
  "session-id": "xyz-1",
  "action": "A123"
}

PUT usagestats/test/4
{
  "user-id": "xyz",
  "session-id": "xyz-1",
  "action": "A123"
}

GET usagestats/test/_search?
{
    "aggs" : {
        "test" : {
            "terms" : { "field" : "user-id.keyword" }
        }
    }
}

(Bryan Stuhlsatz) #3

I think what you want is a pipeline bucket selector aggregation. Your first agg is on the userid. Your second is agg is on the session-id. You then do a bucket filter on session-ids that is >1. Then apply a filter on the visualization where action is A123

https://www.elastic.co/guide/en/elasticsearch/reference/6.1/search-aggregations-pipeline-bucket-selector-aggregation.html


#4

Thank you! I got it to work:

GET usagestats/test/_search?
{
   "aggs":{
      "user":{
         "terms":{
            "field":"user-id.keyword"
         },
         "aggs":{
            "session":{
               "terms":{
                  "field":"session-id.keyword"
               }
            },
            "session_filter":{
               "bucket_selector":{
                  "buckets_path":{
                     "sessions":"session._bucket_count"
                  },
                  "script":"params.sessions > 1"
               }
            }
         }
      }
   }
}

The output is:

"aggregations": {
    "user": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "abc",
          "doc_count": 2,
          "session": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "abc-1",
                "doc_count": 1
              },
              {
                "key": "abc-2",
                "doc_count": 1
              }
            ]
          }
        }
      ]
    }
  }

#5

This might be a very stupid question, but ... Can I use this to visualize the count of "user.buckets" in Kibana using Timelion?