Aggregation result size


(Suriya) #1

Hello!

I'm first aggregating by interface name and then running an average aggregation on a particular field within each bucket. I sort the output in descending order to get the top 10 average values by interface name. However, these values can be slightly off if I set the size of the aggregation to 10, because aggregations are first computed on a per-shard level and only then merged. (See: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html)
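To make the per-shard effect concrete, here is a toy model in Python. It is only a sketch of the idea, not Elasticsearch's actual code: the shard contents, the function name top_by_avg, and the parameter per_shard_buckets are all made up for illustration. Each shard keeps only its own top buckets by local average, and the merged result is computed from whatever the shards sent back.

```python
from collections import defaultdict

def top_by_avg(shards, size, per_shard_buckets):
    """Toy model: each shard returns only its top buckets by local average."""
    merged = defaultdict(lambda: [0.0, 0])  # name -> [sum, count]
    for shard in shards:
        # Rank this shard's buckets by their local average, highest first.
        local = sorted(shard.items(),
                       key=lambda kv: kv[1][0] / kv[1][1],
                       reverse=True)
        # Only the top per_shard_buckets buckets leave the shard.
        for name, (total, count) in local[:per_shard_buckets]:
            merged[name][0] += total
            merged[name][1] += count
    # Merge step: recompute averages from the sums and counts received.
    averages = {name: total / count for name, (total, count) in merged.items()}
    ranked = sorted(averages.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:size]

# Hypothetical data: interface name -> (sum of values, number of documents).
shards = [
    {"eth0": (90.0, 1), "eth1": (10.0, 1), "eth2": (40.0, 1)},
    {"eth0": (10.0, 1), "eth1": (80.0, 1), "eth2": (40.0, 1)},
]

truncated = top_by_avg(shards, size=1, per_shard_buckets=1)
complete = top_by_avg(shards, size=1, per_shard_buckets=3)
print(truncated)  # eth0 averaged over one shard only
print(complete)   # eth0 averaged over both shards
```

In this made-up data, eth0 is reported with an average of 90 when each shard returns a single bucket, while its average over both shards is 50.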

To get more accurate results, I could increase the size of the aggregation, say to 100, but then it will output the top 100 interfaces. Is there a way to calculate the average over however many interfaces I want (to make it more accurate), but still have Elasticsearch return only the top 10 buckets?

Thanks,
Suriya


(Colin Goodheart-Smithe) #2

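Have a look at the shard_size parameter of the terms aggregation. It sets how many buckets each shard returns to the coordinating node, independently of size, so you can raise it while still getting only the top 10 back. You can also look at doc_count_error_upper_bound in the response to see the worst-case error in the document counts.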
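As a rough sketch, the request body could look something like the one built below (Python, just constructing and printing the JSON). The field names interface_name and value, the aggregation names by_interface and avg_value, and the shard_size of 100 are assumptions for illustration, since the original mapping and query weren't posted.

```python
import json

def build_top_interfaces_query(size=10, shard_size=100):
    """Build a search body: top `size` interfaces by average value."""
    return {
        "size": 0,  # no search hits needed, only the aggregation result
        "aggs": {
            "by_interface": {  # hypothetical aggregation name
                "terms": {
                    "field": "interface_name",  # hypothetical field name
                    "size": size,               # buckets in the final response
                    "shard_size": shard_size,   # buckets each shard sends back
                    "order": {"avg_value": "desc"},
                },
                "aggs": {
                    # Average sub-aggregation used for ordering the buckets.
                    "avg_value": {"avg": {"field": "value"}},  # hypothetical field
                },
            }
        },
    }

query = build_top_interfaces_query()
print(json.dumps(query, indent=2))
```

A larger shard_size costs more memory and network traffic per request, so it is worth raising it only as far as the accuracy you need.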


(Suriya) #3

Thank you very much, this should do the trick!

