I have an index with the following mapping:
ad : long
tk : long
ts : date
I index some docs every hour, and these docs have the following properties:
ad: always 999999
tk: a value picked at random from a list of 10 values
ts: the current timestamp
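For reference, one such document could be built roughly like this. It is only a sketch: TK_VALUES and make_doc are hypothetical names, the 10 values are the tk keys that appear in the terms result in this post, and sending ts as epoch milliseconds is an assumption.

```python
import random
import time

# The 10 possible tk values (the keys seen in the terms aggregation result in this post).
TK_VALUES = [
    96741171783034867, 125100716348049356, 82496289943871922,
    79031552758613517, 109651577186779136, 75733601942997721,
    125602239551986866, 115378095770568369, 126639628788471066,
    121104902114270720,
]


def make_doc():
    """Build one document with the three fields described above."""
    return {
        "ad": 999999,                    # always the same value
        "tk": random.choice(TK_VALUES),  # one of the 10 values, picked at random
        "ts": int(time.time() * 1000),   # current time; epoch millis is an assumption
    }
```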
When I use a filter aggregation on 'ad':
query1:
{ 'aggs': {'app_aggs': {'filter': {'term':{'ad':999999}}, 'aggs': {'date_aggs': {'date_histogram': {'field': 'ts', 'interval': 'day', 'time_zone': '+08:00', 'format': 'yyyy-MM-dd HH:mm:ss'}, 'aggs': {'uv_aggs': {'cardinality': {'field': 'tk'}}}}}}}}
result1:
{u'app_aggs': {u'date_aggs': {u'buckets': [{u'uv_aggs': {u'value': 10}, u'key_as_string': u'2017-02-12 00:00:00', u'key': 1486828800000, u'doc_count': 5176}]}, u'doc_count': 5176}}
That is correct: there are 10 distinct values of tk.
But when I use a terms aggregation on 'ad':
query2:
{ 'aggs': {'app_aggs': {'terms': {'field': 'ad'}, 'aggs': {'date_aggs': {'date_histogram': {'field': 'ts', 'interval': 'day', 'time_zone': '+08:00', 'format': 'yyyy-MM-dd HH:mm:ss'}, 'aggs': {'uv_aggs': {'cardinality': {'field': 'tk'}}}}}}}}
result2:
{u'app_aggs': {u'buckets': [{u'date_aggs': {u'buckets': [{u'uv_aggs': {u'value': 8}, u'key_as_string': u'2017-02-12 00:00:00', u'key': 1486828800000, u'doc_count': 5176}]}, u'key': 999999, u'doc_count': 5176}], u'sum_other_doc_count': 0, u'doc_count_error_upper_bound': 0}}
Why does it drop to 8?
Then I replaced the cardinality aggregation on "tk" with a terms aggregation on "tk":
query3:
{ 'aggs': {'app_aggs': {'terms': {"field":"ad"}, 'aggs': {'date_aggs': {'date_histogram': {'field': 'ts', 'interval': 'day', 'time_zone': '+08:00', 'format': 'yyyy-MM-dd HH:mm:ss'}, 'aggs': {'uv_aggs': {'terms': {'field': 'tk'}}}}}}}}
result3:
{u'app_aggs': {u'buckets': [{u'date_aggs': {u'buckets': [{u'uv_aggs': {u'buckets': [{u'key': 96741171783034867, u'doc_count': 545}, {u'key': 125100716348049356, u'doc_count': 537}, {u'key': 82496289943871922, u'doc_count': 522}, {u'key': 79031552758613517, u'doc_count': 521}, {u'key': 109651577186779136, u'doc_count': 520}, {u'key': 75733601942997721, u'doc_count': 517}, {u'key': 125602239551986866, u'doc_count': 513}, {u'key': 115378095770568369, u'doc_count': 511}, {u'key': 126639628788471066, u'doc_count': 505}, {u'key': 121104902114270720, u'doc_count': 485}], u'sum_other_doc_count': 0, u'doc_count_error_upper_bound': 0}, u'key_as_string': u'2017-02-12 00:00:00', u'key': 1486828800000, u'doc_count': 5176}]}, u'key': 999999, u'doc_count': 5176}], u'sum_other_doc_count': 0, u'doc_count_error_upper_bound': 0}}
There really are 10 buckets for "tk"!
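To double-check the count without reading the buckets by eye, the distinct tk keys can be counted from the response. A small sketch, assuming the response has the same shape as the terms-on-"tk" result above; count_tk_per_day is a hypothetical helper name, and the sample is a trimmed copy of that result with only two tk buckets kept.

```python
def count_tk_per_day(aggs):
    """Return {(ad, day): number of distinct tk buckets} for a terms-based result."""
    counts = {}
    for ad_bucket in aggs["app_aggs"]["buckets"]:
        for day_bucket in ad_bucket["date_aggs"]["buckets"]:
            # Collect the distinct tk keys reported for this ad/day combination.
            tk_keys = {b["key"] for b in day_bucket["uv_aggs"]["buckets"]}
            counts[(ad_bucket["key"], day_bucket["key_as_string"])] = len(tk_keys)
    return counts


# Trimmed sample with the same structure as the result above (2 tk buckets only).
sample = {
    "app_aggs": {"buckets": [{
        "key": 999999,
        "date_aggs": {"buckets": [{
            "key_as_string": "2017-02-12 00:00:00",
            "uv_aggs": {"buckets": [
                {"key": 96741171783034867, "doc_count": 545},
                {"key": 125100716348049356, "doc_count": 537},
            ]},
        }]},
    }]},
}

print(count_tk_per_day(sample))
```

Since sum_other_doc_count is 0 in the full result, no tk values were cut off by the terms aggregation's bucket limit.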
Why does the cardinality aggregation give the wrong result?
The Elasticsearch version is 2.4.2.
Any advice is appreciated, thank you.