Elasticsearch terms aggregation: use case for a single field

Hello everyone,

I'm trying to understand terms aggregations in Elasticsearch, and I don't know how to handle one particular use case, so let me explain.
I'm trying to retrieve every distinct "key" value of a single field in my index, together with its document count.

Here is my aggregation request:
POST bigginsight/_search?track_total_hits=true
{
  "aggs": {
    " all_values__downloadTotal": {
      "terms": {
        "field": "downloadTotal"
      }
    }
  },
  "size": 0
}

And here is the result:
"aggregations": {
  " all_values__downloadTotal": {
    "doc_count_error_upper_bound": 77,
    "sum_other_doc_count": 206696,
    "buckets": [
      {
        "key": 0,
        "doc_count": 35920
      },
      {
        "key": 1791,
        "doc_count": 35
      },
      {
      ...
      ...

I would like to know how to avoid getting these two values in the result:
"doc_count_error_upper_bound": 77,
"sum_other_doc_count": 206696,
I want every value of this field, i.e. an array of "key" values, each with the number of documents that have it.
Maybe I'm using the wrong kind of request?

If somebody can help me, I would be glad to hear from them.

Thanks in advance! And have a good day!

@vic22
Since you haven't specified size for the terms aggregation, it defaults to 10. The result indicates that 206,696 documents in your index (sum_other_doc_count) fall outside these 10 buckets (0, 1791, etc.). As you increase size, this number will drop, but the query will become more expensive. If you are happy with the top 10 results, you don't need to worry about this number.
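To make the size point concrete, here is a minimal Python sketch that only builds the request body as a dict (no Elasticsearch client involved). The helper names, the aggregation name "all_values", and the value 1000 are illustrative assumptions, not something from this thread; pick a size that fits the cardinality of your field.

```python
import json


def terms_agg_body(field, size, agg_name="all_values"):
    """Build a search body whose terms aggregation returns up to `size` buckets.

    `agg_name` is an arbitrary label chosen for this sketch.
    """
    return {
        "size": 0,  # return no hits, only the aggregation result
        "aggs": {agg_name: {"terms": {"field": field, "size": size}}},
    }


def is_complete(agg_result):
    """True when no documents fell outside the returned buckets."""
    return agg_result["sum_other_doc_count"] == 0


# 1000 is an arbitrary example value, not a recommendation.
body = terms_agg_body("downloadTotal", size=1000)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of the same POST bigginsight/_search request. Once sum_other_doc_count in the response is 0, every value of the field is present in the buckets.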

Doc counts are approximate. doc_count_error_upper_bound = 77 means that the doc_count of any bucket may be off by at most 77. The terms aggregation has a shard_size parameter; increasing it reduces the error upper bound, but again makes the query more expensive. For a high-cardinality field, making it large enough to cover every term will be cost prohibitive (older releases accepted 0 to mean "unlimited"; that option was removed in Elasticsearch 5.0).
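A sketch of how shard_size could be set, again as a plain Python dict builder. The default of size * 1.5 + 10 is the one given in the terms aggregation documentation; the helper names, the aggregation name, and the factor of 4 are assumptions made for illustration. show_term_doc_count_error is an optional setting that reports the error bound per bucket.

```python
def default_shard_size(size):
    """Documented default for the terms aggregation: size * 1.5 + 10."""
    return int(size * 1.5 + 10)


def terms_agg_with_shard_size(field, size, shard_size, agg_name="all_values"):
    """Build a terms aggregation body with an explicit shard_size."""
    if shard_size < size:
        # A shard_size below size makes no sense: each shard must return
        # at least as many candidate terms as the final result holds.
        raise ValueError("shard_size should not be smaller than size")
    return {
        "size": 0,
        "aggs": {
            agg_name: {
                "terms": {
                    "field": field,
                    "size": size,
                    "shard_size": shard_size,
                    # Optional: adds a per-bucket doc_count_error_upper_bound.
                    "show_term_doc_count_error": True,
                }
            }
        },
    }


# Ask each shard for four times the default number of candidates
# (the factor 4 is an arbitrary example).
body = terms_agg_with_shard_size(
    "downloadTotal", size=10, shard_size=4 * default_shard_size(10)
)
```

Raising shard_size trades memory and network traffic per shard for a tighter error bound, so it is worth increasing it step by step and watching doc_count_error_upper_bound in the response.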

If you really need the counts to be 100% accurate for the top 10, you can first get the top 10, then run the same query again with an additional filter on the downloadTotal field that includes only the top 10 values (0, 1791, etc.).
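The two-pass idea can be sketched like this. The functions only transform dicts, so sending the requests is left to whatever client you use. The helper names and the aggregation name "all_values" are hypothetical, and the sample response is trimmed to the two buckets visible in this thread.

```python
def top_keys(response, agg_name):
    """Extract the bucket keys from a first-pass terms aggregation response."""
    return [b["key"] for b in response["aggregations"][agg_name]["buckets"]]


def second_pass_body(field, keys, agg_name="all_values"):
    """Build the second request: same aggregation, restricted to `keys`.

    Because the terms filter leaves only documents holding one of the
    known values, every shard can return all of them.
    """
    return {
        "size": 0,
        "query": {"bool": {"filter": [{"terms": {field: keys}}]}},
        "aggs": {agg_name: {"terms": {"field": field, "size": len(keys)}}},
    }


# First-pass response, trimmed to the two buckets shown in the question.
first_pass = {
    "aggregations": {
        "all_values": {
            "doc_count_error_upper_bound": 77,
            "sum_other_doc_count": 206696,
            "buckets": [
                {"key": 0, "doc_count": 35920},
                {"key": 1791, "doc_count": 35},
            ],
        }
    }
}

keys = top_keys(first_pass, "all_values")
body = second_pass_body("downloadTotal", keys)
```

This assumes downloadTotal holds a single value per document; with a multi-valued field, the filtered documents could still contribute other terms to the aggregation.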

see https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_calculating_document_count_error

Thank you very much, I now understand my mistake.
Have a good end of the week!
