Sum aggregation on field by distinct value of another field with high cardinality

I have a document like this:

{
    "_id": 1, 
    "factor": 100,
    "field_with_high_cardinality": 1000
}

I know that for each unique field_with_high_cardinality the factor is always the same.

I need to calculate sum aggregation of factor field considering only unique (distinct) values of field_with_high_cardinality .

I tried to calculate terms for field_with_high_cardinality , then avg for each term (since all factor fields are same, avg will give me the risk field back). Then sum_bucket on avarages:

"aggs": {
    "terms_agg": {
      "terms": {
        "field": "field_with_high_cardinality",
        "size": 1000000
      },
      "aggs": {
        "avg_risks": {
          "avg": {
            "field": "factor"
          }
        }
      }
    },
    "sums":{
      "sum_bucket": {
        "buckets_path": "terms_agg.avg_risks"
      }
    }
}

But since I have a lot of field_with_high_cardinality values (almost 99% are unique) I get 2 problems:

  1. Unacceptably long execution time
  2. All risk_terms buckets are returned as part of the response, but I need only the sums field.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.