I have documents of the form:
{
"item": "name",
"@timestamp": <date>,
"category": [
{ "key": "something", "doc_count": num },
{ "key": "otherthing", "doc_count": num },
{ "key": "thirdthing", "doc_count": num }
]
}
The mapping has [category][key] as keyword and [category][doc_count] as long.
There are a few thousand documents for each value of [item], with a lot of repetition in the values of key but variation in the values of doc_count. The data is derived from five-minute snapshots of high-traffic web logs. I'm trying to run something like a terms agg, but instead of counting how many times each key occurs, I want to add up all the doc_count values for that key. Ultimately I want to show the top N keys ranked by those totals. (I realise the counts will be approximate, as with the usual sharding approximations in a terms agg. That's fine.)
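To make the goal concrete, here is a rough client-side sketch in Python of the result I'm after. The sample documents and the helper name top_keys are made up purely for illustration; the real thing would of course need to happen inside the aggregation rather than in client code.

```python
from collections import Counter

# Made-up sample documents in the shape described above (timestamps omitted).
docs = [
    {"item": "name", "category": [
        {"key": "something", "doc_count": 10},
        {"key": "otherthing", "doc_count": 3},
    ]},
    {"item": "name", "category": [
        {"key": "something", "doc_count": 5},
        {"key": "thirdthing", "doc_count": 7},
    ]},
]

def top_keys(docs, n):
    """Sum doc_count per key across all documents and return the top n totals."""
    totals = Counter()
    for doc in docs:
        for entry in doc["category"]:
            totals[entry["key"]] += entry["doc_count"]
    return totals.most_common(n)

print(top_keys(docs, 2))  # [('something', 15), ('thirdthing', 7)]
```

A plain terms agg on key would instead report something as appearing in 2 documents, which is not the number I want.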
It's unclear to me whether I should be trying a nested agg combo of some sort, or perhaps a terms agg with a sum subagg. A little clue as to how I should get started would be very much appreciated.
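For reference, the nested variant I have in mind would look roughly like the request body below, built here as a Python dict. This is only a guess at the shape: it assumes [category] were remapped as type nested (it is not in my current mapping), since as far as I understand a plain object array does not keep each key paired with its own doc_count. The agg names by_category, top_keys and total, and the helper build_query, are names I made up.

```python
import json

def build_query(top_n):
    """Build a search body: nested agg -> terms on key, ordered by summed doc_count."""
    return {
        "size": 0,  # only the aggregation results are wanted, not the hits
        "aggs": {
            "by_category": {
                # Assumption: [category] is mapped as type "nested"
                "nested": {"path": "category"},
                "aggs": {
                    "top_keys": {
                        "terms": {
                            "field": "category.key",
                            "size": top_n,
                            # rank buckets by the sum sub-agg, not by bucket doc count
                            "order": {"total": "desc"},
                        },
                        "aggs": {
                            "total": {"sum": {"field": "category.doc_count"}}
                        },
                    }
                },
            }
        },
    }

print(json.dumps(build_query(10), indent=2))
```

A filter on [item] would presumably go in a query clause alongside the aggs, but I have left that out here.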
tia,
Tom