POST alloy/_search
{
  "size": 0,
  "aggs": {
    "groupByResId": {
      "terms": {
        "field": "resourceId",
        "size": 100,
        "order": [
          {
            "max_score_aggr": "desc"
          }
        ]
      },
      "aggs": {
        "max_score_aggr": {
          "max": {
            "script": {
              "inline": "_score"
            }
          }
        },
        "total_children": {
          "sum": {
            "script": {
              "inline": "doc['termResIds'].values.length"
            }
          }
        },
        "total_children_filter": {
          "bucket_selector": {
            "buckets_path": {
              "total_children": "total_children"
            },
            "script": {
              "params": {
                "numberCondition": 0
              },
              "inline": "params.total_children > params.numberCondition"
            }
          }
        },
        "count_resources": {
          "cardinality": {
            "field": "resourceId"
          }
        }
      }
    },
    "sss": {
      "sum_bucket": {
        "buckets_path": "groupByResId>count_resources"
      }
    },
    "count_resources": {
      "cardinality": {
        "field": "resourceId"
      }
    }
  },
  "query": {
    ...
  }
}
Hi, I'm currently trying to write a query using Elasticsearch 5.6. The purpose of the query is to obtain a subset of data (the query conditions have been omitted for simplicity).
I then need to group the documents that share the same resource id into buckets and, for each bucket, sum the number of elements in a certain array field (termResIds).
Finally, only the buckets whose sum is greater than a certain number should be kept.
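For reference, this is roughly the shape of my documents (only the fields relevant to the aggregations; the values are made up):

{
  "resourceId": "res-1",
  "termResIds": ["t1", "t2", "t3"]
}

So total_children sums the length of termResIds over all documents that fall into the same resourceId bucket.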
Up to this point everything works quite well. The one remaining problem is that I also need the total number of buckets (i.e. the total number of distinct resource ids) that satisfy the conditions above. I tried to use the sum_bucket aggregation to count the returned buckets, and it almost does what I want, but it ONLY considers the buckets actually returned in the response, which depends on the terms aggregation's size parameter.
However, I would like only the first 100 buckets to be returned, not all of them, while still getting the correct total count.
Is there a way to achieve the described result? Thank you in advance for your time!