Combine term and bucket range query

I'm attempting to extract HTTP success/failure counts per user using an Elasticsearch aggregation.

I'm looking at two fields, "user.name" and "http.response.status_code". My goal is to use a keyed range bucket aggregation on the HTTP response codes to label the 200s as "success" and the 400s as "failure", ignoring all other codes. I then want to aggregate these on a per-user basis so that my output would look something like:

      "buckets": [
        {
          "key": [
            "user1",
            "success"
          ],
          "key_as_string": "user1|success",
          "doc_count": 1766
        },
        {
          "key": [
            "user1",
            "failure"
          ],
          "key_as_string": "user1|failure",
          "doc_count": 245
        }
      ]

I've done both separately, using a multi-terms aggregation and a range bucket aggregation, but is there a way to combine the two different kinds of aggregation into one?

I can always resort to bucketing the ranges and dropping the excess values with a script, but I'd prefer to do it all within the query if possible.

Thanks in advance,
Alex

Yes, you can combine different kinds of aggregations into one. In your case, you can use a terms aggregation on the "user.name" field and then a sub-aggregation with a range aggregation on the "http.response.status_code" field. Here is an example of how you can do it:

GET /_search
{
  "size": 0,
  "aggs": {
    "users": {
      "terms": {
        "field": "user.name"
      },
      "aggs": {
        "response_status": {
          "range": {
            "keyed": true,
            "field": "http.response.status_code",
            "ranges": [
              {
                "key": "success",
                "from": 200,
                "to": 300
              },
              {
                "key": "failure",
                "from": 400,
                "to": 500
              }
            ]
          }
        }
      }
    }
  }
}

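This will give you a response where each user has a separate bucket, and within each user's bucket there are sub-buckets for "success" and "failure" based on the HTTP response status code. The result is nested (user, then range) rather than a flat list of combined keys like in your example. Documents with any other status code still count toward the user's own doc_count, but they fall into neither sub-bucket. Please note that each range is half-open, meaning it includes the "from" value and excludes the "to" value. So, for example, a status code of 200 will be included in the "success" range, but a status code of 300 will not.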
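If you still want the flat "user|label" rows from your question, the nested response can be reshaped on the client side. Below is a minimal Python sketch of that step. The helper name flatten_status_counts and the sample response are made up for illustration (user1's counts are borrowed from the question, everything else is invented), and it assumes the aggregation names "users" and "response_status" from the query above.

```python
# Illustrative response fragment, trimmed to the parts the helper reads.
# All values are invented; only the shape follows the query above
# (terms buckets as a list, keyed range buckets as an object).
sample_response = {
    "aggregations": {
        "users": {
            "buckets": [
                {
                    "key": "user1",
                    "doc_count": 2100,
                    "response_status": {
                        "buckets": {
                            "success": {"from": 200.0, "to": 300.0, "doc_count": 1766},
                            "failure": {"from": 400.0, "to": 500.0, "doc_count": 245},
                        }
                    },
                },
                {
                    "key": "user2",
                    "doc_count": 40,
                    "response_status": {
                        "buckets": {
                            "success": {"from": 200.0, "to": 300.0, "doc_count": 30},
                            "failure": {"from": 400.0, "to": 500.0, "doc_count": 4},
                        }
                    },
                },
            ]
        }
    }
}


def flatten_status_counts(response):
    """Hypothetical helper: turn user -> range buckets into flat rows."""
    rows = []
    for user_bucket in response["aggregations"]["users"]["buckets"]:
        user = user_bucket["key"]
        # With "keyed": true the range buckets are an object keyed by label.
        for label, range_bucket in user_bucket["response_status"]["buckets"].items():
            rows.append(
                {
                    "key": [user, label],
                    "key_as_string": f"{user}|{label}",
                    "doc_count": range_bucket["doc_count"],
                }
            )
    return rows


rows = flatten_status_counts(sample_response)
for row in rows:
    print(row["key_as_string"], row["doc_count"])
```

Note that user1's total (2100) is larger than success plus failure (2011) in this made-up sample: the difference stands for documents with other status codes, which never reach the flattened rows.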

Please add your index name to the request path (for example, GET /<your-index>/_search) and run this query.
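As a rough illustration of where the index name goes, here is a small Python sketch that only builds the request path and body, without sending anything. The function name build_search_request, the index name "my-logs-index" and the max_users parameter are assumptions for the example. The one addition to the query is "size" on the terms aggregation, which otherwise returns only the top 10 users by default.

```python
import json


def build_search_request(index, max_users=10):
    """Hypothetical helper: return (path, body) for the per-user status query."""
    path = f"/{index}/_search"
    body = {
        "size": 0,  # no hits needed, only the aggregation results
        "aggs": {
            "users": {
                # "size" caps how many user buckets come back (default is 10).
                "terms": {"field": "user.name", "size": max_users},
                "aggs": {
                    "response_status": {
                        "range": {
                            "keyed": True,
                            "field": "http.response.status_code",
                            "ranges": [
                                {"key": "success", "from": 200, "to": 300},
                                {"key": "failure", "from": 400, "to": 500},
                            ],
                        }
                    }
                },
            }
        },
    }
    return path, body


# "my-logs-index" is a placeholder; substitute your own index or index pattern.
path, body = build_search_request("my-logs-index", max_users=100)
print("GET", path)
print(json.dumps(body, indent=2))
```

The printed body can be pasted into the console as-is, or passed to whichever client you already use.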

OpsGPT.io helped with part of this answer.

Exactly what I was looking for. Thank you!
