How to filter top level aggregation bucket using lower level aggregation results


(Gab Klein) #1

Hi guys,

I have a question on how can I use aggregation filter from a top level aggregation using lower level aggregation result field ?

My data:

[ 
  { id: 1, product_id: 1, price: 6 },
  { id: 2, product_id: 2, price: 5 },
  { id: 3, product_id: 1, price: 4 },
  { id: 4, product_id: 2, price: 8 },
  { id: 5, product_id: 3, price: 7 }
]

Query I would like to run:

{
  "size": 0,
  "aggs": {
    "products": {
      "filter": {        
        "range": {
          "price_sum": { //Using lower level aggregation filed
            "gte" : 10,
            "lte" : 20
          }
        }
      },
      "aggs": {
        "products_ids": {
          "terms": {
            "field": "product_id",
            "size": 0
          },
          "aggs": {
            "price_sum": {
              "sum": {
                "script": "doc['price'].value.toInteger()"
              }
            }
          }
        }
      }
    }
  }
}

Result expected:

product_ids: [
  { key: 1,  price_sum: 10 },
  { key: 2,  price_sum: 13 }
]

(Mark Walkom) #2

You can't, you need to figure that out first.


(Colin Goodheart-Smithe) #3

You could get to what you are after by using the bucket_selector pipeline aggregation with something like the below.

This comes with the usual caveats of using scripting which you can read bout in the documentation here

Also, it is not recommended to use "size": 0 with the terms aggregation since for high cardinality fields (like a field for product id) it can cause memory issues.

{
  "size": 0,
  "aggs": {
    "products_ids": {
      "terms": {
        "field": "product_id",
        "size": 0
      },
      "aggs": {
        "price_sum": {
          "sum": {
            "script": "doc['price'].value.toInteger()"
          }
        },
        "bucket_filter": {
          "bucket_selector": {
            "buckets_path": {
              "price_sum": "price_sum"
            },
            "script": "price_sum <= 20 && price_sum >= 10"
          }
        }
      }
    }
  }
}

The output for the above aggregation using your example data is:

"products_ids": {
  "doc_count_error_upper_bound": 0,
  "sum_other_doc_count": 0,
  "buckets": [
    {
      "key": 1,
      "doc_count": 2,
      "price_sum": {
        "value": 10
      }
    },
    {
      "key": 2,
      "doc_count": 2,
      "price_sum": {
        "value": 13
      }
    }
  ]
}

(system) #4