How to do pipeline range aggregation on terms aggregation?


(Phoenixgao) #1

For example, I have "events" indexed into Elasticsearch, like this:

    {
        "event": "visit",
        "user_id": 123,
        "timestamp": "2016-08-16T00:02:03"
    }

With a terms bucket aggregation on the field user_id, I can get how many times each user visited during a specific time period:

    "aggregations" : {
        "user_visits" : {
            "doc_count_error_upper_bound": 0, 
            "sum_other_doc_count": 0, 
            "buckets" : [ 
                {
                    "key" : 1,
                    "doc_count" : 13
                },
                {
                    "key" : 2,
                    "doc_count" : 22
                },
                {
                    "key" : 3,
                    "doc_count" : 26
                },
                {
                    "key" : 4,
                    "doc_count" : 35
                }
            ]
        }
    }
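A request along the following lines would presumably produce a response of that shape. This is only a sketch of the request body, built as a Python dict: the field names come from the event above, while the time window (`PERIOD_START`, `PERIOD_END`) and the terms `size` of 1000 are assumptions chosen for illustration.

```python
import json

# Assumed time window; the post only says "a specific time period".
PERIOD_START = "2016-08-01T00:00:00"
PERIOD_END = "2016-09-01T00:00:00"

request_body = {
    "size": 0,  # only the aggregations are needed, not the matching hits
    "query": {
        "bool": {
            "filter": [
                {"term": {"event": "visit"}},
                {"range": {"timestamp": {"gte": PERIOD_START, "lt": PERIOD_END}}},
            ]
        }
    },
    "aggs": {
        "user_visits": {
            # One bucket per user_id; doc_count is that user's visit count.
            "terms": {"field": "user_id", "size": 1000}  # size is an assumed cap
        }
    },
}

# Print the JSON that would be sent as the search request body.
print(json.dumps(request_body, indent=2))
```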

Here the key is the user id. What I actually want, though, is a range aggregation over those counts: how many users visited between x and y times, how many between y and z, and so on.

This is the expected result:

    "aggregations": {
        "user_visit_ranges" : {
            "buckets": [
                {
                    "to": 10,
                    "doc_count": 0
                },
                {
                    "from": 10,
                    "to": 20,
                    "doc_count": 1
                },
                {
                    "from": 20,
                    "to": 30,
                    "doc_count": 2
                },
                {
                    "from": 30,
                    "doc_count": 1
                }
            ]
        }
    }
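For reference, the binning being asked for can be pinned down with a small client-side sketch over the terms buckets shown earlier. It only illustrates the intended semantics and is not an Elasticsearch feature; the helper name `bucket_visit_counts` is hypothetical, and the boundaries 10, 20, 30 are taken from the expected result, with each range treated as from-inclusive and to-exclusive.

```python
from bisect import bisect_right

# Terms-aggregation buckets copied from the response above (key = user_id).
user_visits = [
    {"key": 1, "doc_count": 13},
    {"key": 2, "doc_count": 22},
    {"key": 3, "doc_count": 26},
    {"key": 4, "doc_count": 35},
]


def bucket_visit_counts(buckets, boundaries):
    """Count users per visit-count range; each range is [from, to)."""
    counts = [0] * (len(boundaries) + 1)
    for bucket in buckets:
        # bisect_right places a count equal to a boundary in the higher range.
        counts[bisect_right(boundaries, bucket["doc_count"])] += 1

    result = []
    for i, n in enumerate(counts):
        entry = {}
        if i > 0:
            entry["from"] = boundaries[i - 1]
        if i < len(boundaries):
            entry["to"] = boundaries[i]
        entry["doc_count"] = n
        result.append(entry)
    return result


user_visit_ranges = bucket_visit_counts(user_visits, [10, 20, 30])
for entry in user_visit_ranges:
    print(entry)
```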

Is it possible to do this kind of aggregation in Elasticsearch (without client-side scripting), and if so, how?

Thanks


(Gustavo Orsi) #2

Why do you need a pipeline aggregation for this? Can't you just use "date_range" with a nested "terms" or "cardinality" aggregation?

    {
      "size": 0,
      "aggs": {
        "filter_event_type": {
          "filter": {
            "term": {
              "event": "visit"
            }
          },
          "aggs": {
            "user_visit_ranges": {
              "date_range": {
                "field": "timestamp",
                "ranges": [
                  {
                    "from": "2016-03-17T00:02:03",
                    "to": "2016-08-21T00:02:03",
                    "key": "first-period"
                  },
                  {
                    "from": "2016-08-22T00:02:03",
                    "to": "2016-12-17T00:02:03",
                    "key": "second-period"
                  }
                ]
              },
              "aggs": {
                "user_visits": {
                  "terms": {
                    "field": "user_id"
                  }
                }
              }
            }
          }
        }
      }
    }
