I have documents of the form:
{
"item": "name",
"@timestamp": <date>,
"category": [
{ "key": "something", "doc_count": num },
{ "key": "otherthing", "doc_count": num },
{ "key": "thirdthing", "doc_count": num }
]
}
The mapping has [category][key] as keyword and [category][doc_count] as long.
There are a few thousand documents for each value of [item], with a lot of repetition of the values for key but variation in the value for doc_count. The data is derived from five-minute snapshots of high-traffic web logs. I'm trying to run something like a terms agg, but I want to add up all the values of doc_count for each key rather than just counting the instances of key itself. Ultimately I want to show the top N keys based on those totals. (I realise the counts will be approximate, à la sharding approximations; that's fine.)
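To make the intended result concrete, here is a minimal Python sketch of the computation described above, run over a couple of in-memory documents rather than through Elasticsearch. The sample values and the helper name top_keys_by_total are invented for illustration only.

```python
from collections import Counter


def top_keys_by_total(docs, n):
    """Sum doc_count per category key across all docs and return the top n."""
    totals = Counter()
    for doc in docs:
        for entry in doc.get("category", []):
            totals[entry["key"]] += entry["doc_count"]
    # most_common sorts by total, largest first
    return totals.most_common(n)


# Hypothetical sample data (values invented for illustration only).
sample_docs = [
    {"item": "name", "category": [
        {"key": "something", "doc_count": 10},
        {"key": "otherthing", "doc_count": 3},
    ]},
    {"item": "name", "category": [
        {"key": "something", "doc_count": 5},
        {"key": "thirdthing", "doc_count": 20},
    ]},
]

print(top_keys_by_total(sample_docs, 2))
# -> [('thirdthing', 20), ('something', 15)]
```

Here "something" appears in both documents, so a plain count would give it 2, whereas the total I'm after is 15.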
It's unclear to me whether I should be trying a nested agg combo of some sort, or perhaps a terms agg with a sum subagg. A little clue as to how I should get started would be very much appreciated.
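One possible starting point combines both ideas, sketched here as a Python dict that mirrors the JSON request body: a nested agg on category, containing a terms agg on category.key that is ordered by a sum subagg on category.doc_count. This is a sketch under assumptions, not a confirmed solution. It assumes category is mapped as type nested; with a plain object mapping the array entries are flattened, so each key would be summed against every doc_count in the same document. It also assumes item is a keyword field. The aggregation names (categories, by_key, total) and the function name are placeholders.

```python
import json


def build_top_keys_request(n, item=None):
    """Build a search body for the top n category keys by summed doc_count.

    Assumes `category` is mapped as type `nested` (not confirmed by the mapping above).
    """
    body = {
        "size": 0,  # return aggregation results only, no hits
        "aggs": {
            "categories": {
                "nested": {"path": "category"},
                "aggs": {
                    "by_key": {
                        "terms": {
                            "field": "category.key",
                            "size": n,
                            # order buckets by the sum subagg instead of by count
                            "order": {"total": "desc"},
                        },
                        "aggs": {
                            "total": {"sum": {"field": "category.doc_count"}}
                        },
                    }
                },
            }
        },
    }
    if item is not None:
        # Optional filter on one item; assumes `item` is a keyword field.
        body["query"] = {"term": {"item": item}}
    return body


print(json.dumps(build_top_keys_request(10, item="name"), indent=2))
```

The printed JSON can be sent as the body of a _search request. Ordering terms buckets by a subagg adds to the sharding approximation already mentioned, so the top N may be less exact than with the default count ordering.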
tia,
Tom