Aggregation - Calculate the number of distinct values


#1

Hello,

I'm running an aggregation over all the documents in my index. The problem is that Elasticsearch doesn't return all the buckets because of the default size. How can I find out the number of distinct bucket keys, so that I know which size to set on the aggregation?

This is the query:

{
  "size": 0,
  "aggs": {
    "genres": {
      "nested": {
        "path": "tags"
      },
      "aggs": {
        "key_value": {
          "terms": {
            "field": "tags.values.value"
          }
        }
      }
    }
  }
}

and the result looks like:

  },
  "aggregations": {
    "genres": {
      "doc_count": 285391,
      "key_value": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 46823,
        "buckets": [
          {
            "key": "null",
            "doc_count": 251811
          },
          {
            "key": "mmakku",
            "doc_count": 15870
          },
          {
            "key": "mmbaden",
            "doc_count": 15870
          },
          {
            "key": "mmettlingen",
            "doc_count": 15870
          },
          {
            "key": "mmeurope",
            "doc_count": 15870
          },
          {
            "key": "mmgermany",
            "doc_count": 15870
          },
          {
            "key": "mmkarlsruhe",
            "doc_count": 15870
          }

Thank you!


(Junaid) #2

@Alexandra1 I would recommend looking at the cardinality aggregation, which returns the approximate number of distinct values of a field. It is described in the Elasticsearch documentation.


#3

Here is a sample, supposing you want to count the distinct members based on tag_id:
GET index_name/_search
{
  "size": 0,
  "aggs": {
    "distinct_members": {
      "cardinality": {
        "field": "tag_id.keyword"
      }
    }
  }
}
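
For the nested field from the original question, the same idea might look like the sketch below. It assumes the mapping from post #1 (nested path "tags", field "tags.values.value"); the aggregation name "distinct_values" is only a placeholder, and "precision_threshold" is optional (40000 is the highest value Elasticsearch supports, the default is 3000). The body is sent to the same _search endpoint as above.

```json
{
  "size": 0,
  "aggs": {
    "genres": {
      "nested": {
        "path": "tags"
      },
      "aggs": {
        "distinct_values": {
          "cardinality": {
            "field": "tags.values.value",
            "precision_threshold": 40000
          }
        }
      }
    }
  }
}
```

The count should come back under aggregations.genres.distinct_values.value. Since cardinality is an approximation, it is probably safer to add some headroom when using that number as the "size" of the terms aggregation.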


#4

Seems to work with cardinality. Thanks a lot!


#5

Cheers! Please mark it as the solution.