Total number of buckets after bucket selector aggregation filter

POST alloy/_search
{
  "size": 0,
  "aggs": {
    "groupByResId": {
      "terms": {
        "field": "resourceId",
        "size": 100,
        "order": [
          {
            "max_score_aggr": "desc"
          }
        ]
      },
      "aggs": {
        "max_score_aggr": {
          "max": {
            "script": {
              "inline": "_score"
            }
          }
        },
        "total_children": {
          "sum": {
            "script": {
              "inline": "doc['termResIds'].values.length"
            }
          }
        },
        "total_children_filter": {
          "bucket_selector": {
            "buckets_path": {
              "total_children": "total_children"
            },
            "script": {
              "params": {
                "numberCondition": 0
              },
              "inline": "params.total_children > params.numberCondition"
            }
          }
        },
        "count_resources": {
          "cardinality": {
            "field": "resourceId"
          }
        }
      }
    },
    "sss": {
      "sum_bucket": {
        "buckets_path": "groupByResId>count_resources"
      }
    },
    "count_resources": {
      "cardinality": {
        "field": "resourceId"
      }
    }
  },
  "query": {
    ...
  }
}

Hi, I'm writing a query for Elasticsearch 5.6. The query first selects a subset of documents (the query conditions have been omitted for simplicity).
I then need to group the documents that share a resource id into buckets and compute, for each bucket, the sum of the number of elements in a certain array field (termResIds).
Finally, only the buckets whose sum is greater than a given number (numberCondition) should be kept.
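
To make the intended logic concrete, here is a minimal plain-Python sketch of the three steps above (group by resourceId, sum the array lengths, keep buckets above the threshold). It only models the logic on in-memory dicts and does not talk to Elasticsearch; the function name `filter_buckets` and the sample documents are made up for illustration.

```python
from collections import defaultdict


def filter_buckets(docs, number_condition=0):
    """Group docs by resourceId, sum len(termResIds), keep sums > number_condition.

    Hypothetical helper: mirrors the terms + sum + bucket_selector steps
    of the query, but on plain Python dicts instead of an index.
    """
    totals = defaultdict(int)
    for doc in docs:
        # Equivalent in spirit to doc['termResIds'].values.length in the script.
        totals[doc["resourceId"]] += len(doc.get("termResIds", []))
    # Equivalent in spirit to params.total_children > params.numberCondition.
    return {rid: total for rid, total in totals.items() if total > number_condition}


# Made-up sample documents, not real data.
docs = [
    {"resourceId": "a", "termResIds": [1, 2]},
    {"resourceId": "a", "termResIds": [3]},
    {"resourceId": "b", "termResIds": []},
    {"resourceId": "c", "termResIds": [4]},
]

selected = filter_buckets(docs)
print(selected)  # {'a': 3, 'c': 1}
```

Bucket "b" is dropped because its sum is 0, which is not greater than numberCondition.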

Up to this point everything works. The remaining problem is that I also need the total number of buckets (that is, the total number of resource ids) that satisfy the conditions above. I tried to count the buckets with the sum_bucket aggregation (sss in the query above), which sums the per-bucket count_resources value. It produces the kind of count I want, but ONLY over the buckets actually returned in the response, so the result depends on the size parameter of the terms aggregation.
I would like only the first 100 buckets to be returned, not all of them, while still getting the correct total count.
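
The mismatch can be shown with a small plain-Python sketch. This is a simplified model of the behaviour I am seeing, not a description of how Elasticsearch executes the aggregation internally: the function `summarize` and the generated totals are hypothetical, and buckets are ranked by their sum here as a stand-in for the max_score_aggr ordering in the real query.

```python
def summarize(selected_totals, size=100):
    """Return the first `size` buckets plus two different bucket counts.

    Hypothetical helper: `selected_totals` maps resourceId -> sum for the
    buckets that already passed the filter.
    """
    ranked = sorted(selected_totals.items(), key=lambda item: item[1], reverse=True)
    returned = ranked[:size]  # what the response would contain
    return {
        "returned_buckets": returned,
        # What counting over the returned buckets gives (depends on size).
        "count_over_returned": len(returned),
        # What I actually need: every bucket that satisfies the conditions.
        "count_over_all": len(ranked),
    }


# 250 made-up buckets that all pass the filter.
totals = {f"res-{i}": i + 1 for i in range(250)}
result = summarize(totals, size=100)

print(result["count_over_returned"])  # 100
print(result["count_over_all"])       # 250
```

With size set to 100, the count computed over the returned buckets is capped at 100, while the count I need is 250.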

Is there a way to achieve the described result? Thank you in advance for your time!

