Suppose that I am a hat maker and have an Elasticsearch index in which each document corresponds to the sale of one hat. Part of each sales record is the name of the store at which the hat was
sold. I want to find the number of hats sold by each store, and the average number of hats sold
across all stores. The best way I have been able to come up with is this search:
GET hat_sales/_search
{
  "size": 0,
  "query": {"match_all": {}},
  "aggs": {
    "stores": {
      "terms": {
        "field": "storename",
        "size": 65536
      },
      "aggs": {
        "sales_count": {
          "cardinality": {
            "field": "_id"
          }
        }
      }
    },
    "average_sales_count": {
      "avg_bucket": {
        "buckets_path": "stores>sales_count"
      }
    }
  }
}
(Aside: I set the size to 65536 because that is the default value of the search.max_buckets setting, the maximum number of buckets allowed in a single response.)
The problem with this query is that the sales_count aggregation performs a redundant calculation: each stores bucket already has a doc_count property. But how can I access this doc_count in a buckets path?
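One possibility, offered as a sketch rather than a confirmed answer: the buckets_path syntax for pipeline aggregations documents a special _count path that refers to a bucket's document count instead of a named metric. Assuming that applies here, the request body for the same GET hat_sales/_search call could drop the sales_count sub-aggregation entirely. The index name, field name, and size are unchanged from the query above.

```json
{
  "size": 0,
  "aggs": {
    "stores": {
      "terms": {
        "field": "storename",
        "size": 65536
      }
    },
    "average_sales_count": {
      "avg_bucket": {
        "buckets_path": "stores>_count"
      }
    }
  }
}
```

If this works as described, each stores bucket still reports its own doc_count, and average_sales_count is the mean of those counts, so the cardinality aggregation on _id is no longer needed.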