Cardinality aggregation gives different results under filter and terms aggregations

I have a dataset with the following mapping:
ad : long
tk : long
ts : date

I index some docs every hour, and each doc has the following properties:
ad: always 999999
tk: a value picked at random from a list of 10 values
ts: the current timestamp
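
For anyone who wants to reproduce the setup, here is a minimal sketch of how such docs could be generated. The pool of tk values and the helper name make_doc are assumptions for illustration; the real list of 10 values is not given in this post.

```python
import random
from datetime import datetime, timezone

AD_VALUE = 999999
# Hypothetical pool of 10 tk values (the real list is not shown in the post).
TK_POOL = [10**17 + i for i in range(10)]


def make_doc(rng=random):
    """Build one document matching the mapping: ad (long), tk (long), ts (date)."""
    return {
        "ad": AD_VALUE,                                 # always the same value
        "tk": rng.choice(TK_POOL),                      # random pick from the 10 values
        "ts": datetime.now(timezone.utc).isoformat(),   # current timestamp
    }


docs = [make_doc() for _ in range(1000)]
distinct_tk = {d["tk"] for d in docs}
print(len(distinct_tk))  # never more than 10
```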

When I use a filter aggregation on 'ad':
query1:
{'aggs': {'app_aggs': {
    'filter': {'term': {'ad': 999999}},
    'aggs': {'date_aggs': {
        'date_histogram': {'field': 'ts', 'interval': 'day', 'time_zone': '+08:00', 'format': 'yyyy-MM-dd HH:mm:ss'},
        'aggs': {'uv_aggs': {'cardinality': {'field': 'tk'}}}}}}}}
result1:
{u'app_aggs': {u'date_aggs': {u'buckets': [{u'uv_aggs': {u'value': 10}, u'key_as_string': u'2017-02-12 00:00:00', u'key': 1486828800000, u'doc_count': 5176}]}, u'doc_count': 5176}}

That is correct: tk has 10 distinct values.

But when I use a terms aggregation on 'ad':
query2:
{'aggs': {'app_aggs': {
    'terms': {'field': 'ad'},
    'aggs': {'date_aggs': {
        'date_histogram': {'field': 'ts', 'interval': 'day', 'time_zone': '+08:00', 'format': 'yyyy-MM-dd HH:mm:ss'},
        'aggs': {'uv_aggs': {'cardinality': {'field': 'tk'}}}}}}}}

result2:
{u'app_aggs': {u'buckets': [{u'date_aggs': {u'buckets': [{u'uv_aggs': {u'value': 8}, u'key_as_string': u'2017-02-12 00:00:00', u'key': 1486828800000, u'doc_count': 5176}]}, u'key': 999999, u'doc_count': 5176}], u'sum_other_doc_count': 0, u'doc_count_error_upper_bound': 0}}

Why does the count drop to 8?

Then I replaced the cardinality aggregation on 'tk' with a terms aggregation on 'tk':

query:
{'aggs': {'app_aggs': {
    'terms': {'field': 'ad'},
    'aggs': {'date_aggs': {
        'date_histogram': {'field': 'ts', 'interval': 'day', 'time_zone': '+08:00', 'format': 'yyyy-MM-dd HH:mm:ss'},
        'aggs': {'uv_aggs': {'terms': {'field': 'tk'}}}}}}}}

result:
{u'app_aggs': {u'buckets': [{u'date_aggs': {u'buckets': [{u'uv_aggs': {u'buckets': [{u'key': 96741171783034867, u'doc_count': 545}, {u'key': 125100716348049356, u'doc_count': 537}, {u'key': 82496289943871922, u'doc_count': 522}, {u'key': 79031552758613517, u'doc_count': 521}, {u'key': 109651577186779136, u'doc_count': 520}, {u'key': 75733601942997721, u'doc_count': 517}, {u'key': 125602239551986866, u'doc_count': 513}, {u'key': 115378095770568369, u'doc_count': 511}, {u'key': 126639628788471066, u'doc_count': 505}, {u'key': 121104902114270720, u'doc_count': 485}], u'sum_other_doc_count': 0, u'doc_count_error_upper_bound': 0}, u'key_as_string': u'2017-02-12 00:00:00', u'key': 1486828800000, u'doc_count': 5176}]}, u'key': 999999, u'doc_count': 5176}], u'sum_other_doc_count': 0, u'doc_count_error_upper_bound': 0}}

There really are 10 buckets for 'tk'.
Why does the cardinality aggregation give the wrong result?
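
As a sanity check, the exact distinct count can be computed directly from the terms buckets. This is a small sketch using the bucket keys and doc counts copied from the result above; the variable names are only illustrative.

```python
# tk buckets copied from the terms aggregation result above
tk_buckets = [
    {"key": 96741171783034867, "doc_count": 545},
    {"key": 125100716348049356, "doc_count": 537},
    {"key": 82496289943871922, "doc_count": 522},
    {"key": 79031552758613517, "doc_count": 521},
    {"key": 109651577186779136, "doc_count": 520},
    {"key": 75733601942997721, "doc_count": 517},
    {"key": 125602239551986866, "doc_count": 513},
    {"key": 115378095770568369, "doc_count": 511},
    {"key": 126639628788471066, "doc_count": 505},
    {"key": 121104902114270720, "doc_count": 485},
]

# Exact number of distinct tk values, and total docs across the buckets
exact_distinct = len({b["key"] for b in tk_buckets})
total_docs = sum(b["doc_count"] for b in tk_buckets)

print(exact_distinct)  # 10, not the 8 reported by the cardinality aggregation
print(total_docs)      # 5176, matching doc_count of the date bucket
```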

The Elasticsearch version is 2.4.2.

Any advice would be helpful, thank you.

What's the cardinality of the data like?
See https://www.elastic.co/guide/en/elasticsearch/reference/2.4/search-aggregations-metrics-cardinality-aggregation.html#_counts_are_approximate
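
One thing that may be worth trying, going by that page: the cardinality aggregation accepts a precision_threshold option. As far as I recall, the 2.x docs say its default depends on the number of parent aggregations that create multiple buckets (such as terms or histograms), which could explain why nesting under terms instead of filter changes the count. Below is a minimal sketch of query2 with the threshold set explicitly. The value 1000 is only an illustrative choice, and build_query is a hypothetical helper, not part of any client library.

```python
def build_query(precision_threshold=1000):
    """Return query2 as a dict, with an explicit precision_threshold on the
    cardinality aggregation (1000 is an illustrative value, not a recommendation)."""
    return {
        "aggs": {
            "app_aggs": {
                "terms": {"field": "ad"},
                "aggs": {
                    "date_aggs": {
                        "date_histogram": {
                            "field": "ts",
                            "interval": "day",
                            "time_zone": "+08:00",
                            "format": "yyyy-MM-dd HH:mm:ss",
                        },
                        "aggs": {
                            "uv_aggs": {
                                "cardinality": {
                                    "field": "tk",
                                    # counts below this threshold are expected to be close to exact
                                    "precision_threshold": precision_threshold,
                                }
                            }
                        },
                    }
                },
            }
        }
    }


query = build_query()
```

The resulting dict can be sent as the search body in the same way as query2. If uv_aggs then reports 10, the lowered default precision was likely the cause.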
