I have to split documents in index based on field values, and then calculate sum of other field's value in each bucket, and sort by buckets by this resulting sum.
I came up with two approaches to solve this:
{
"size": 0,
"aggs": {
"by_vendor": {
"terms": {
"field": "vendorId"
},
"aggs": {
"sum_per_vendor": {
"sum": {
"field": "count"
}
},
"vendor_sort": {
"bucket_sort": {
"sort": [
{
"sum_per_vendor": {
"order": "desc"
}
}
]
}
}
}
}
}
}
And another:
{
"size": 0,
"aggs": {
"by_vendor": {
"terms": {
"field": "vendorId",
"order": {
"sum_per_vendor": "desc"
}
},
"aggs": {
"sum_per_vendor": {
"sum": {
"field": "count"
}
}
}
}
}
}
Both queries produce different results. First few buckets have the same count in both queries, but other are missing in first query result. I suspect the first query is somehow incorrect, but can someone help me figure out why?
My cluster consists with two nodes, although it seems all data stored only in one of them. Index size is 2,000,000 documents. And I am using Elasticsearch 7.6.