Why do I get different results from these two DSL queries?

#1
{
  "size": 0,
  "aggs": {
    "by_days": {
      "filter": {
        "bool": {
          "must": [
            {"term": {"AppID": 5091}},
            {"range": {
              "RecordTime": {
                "gte": 1479312000,
                "lt": 1479398400
              }
            }}
          ]
        }
      },
      "aggs": {
        "day_data": {
          "date_histogram": {
            "field": "RecordTime",
            "interval": "day",
            "format": "yyyy-MM-dd",
            "time_zone": "+08:00"
          },
          "aggs": {
            "distinct_user": {
              "cardinality": {
                "script": "doc['Channel'].value + ' ' + doc['AreaServerID'].value + ' ' + doc['RoleID'].value"
              }
            }
          }
        }
      }
    }
  }
}

#result

{
  "key_as_string": "2016-11-17",
  "key": 1479312000000,
  "doc_count": 22636,
  "distinct_user": {
    "value": 3103
  }
}

#2

{
  "size": 0,
  "aggs": {
    "by_days": {
      "filter": {
        "bool": {
          "must": [
            {"term": {"AppID": 5091}},
            {"range": {
              "RecordTime": {
                "gte": 1479312000,
                "lt": 1479398400
              }
            }}
          ]
        }
      },
      "aggs": {
        "distinct_user": {
          "cardinality": {
            "script": "doc['Channel'].value + ' ' + doc['AreaServerID'].value + ' ' + doc['RoleID'].value"
          }
        }
      }
    }
  }
}

#result

"by_days": {
  "doc_count": 22636,
  "distinct_user": {
    "value": 3067
  }
}

The distinct_user value is different (3103 vs. 3067), although doc_count is 22636 in both cases. I thought they should be the same.

Please help, thanks.

Hi @moonnap,

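The documentation describes the cardinality aggregation as "A single-value metrics aggregation that calculates an approximate count of distinct values". Because the count is approximate, the two requests are not guaranteed to return exactly the same value, even over the same documents.
You can find more information in this paragraph of the documentation and on this page.
It might be worth experimenting with the precision_threshold setting.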
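
As a rough sketch of what that could look like, here is query #2 with precision_threshold added to the cardinality aggregation. The value 40000 is only an illustrative choice (it is, as far as I know, the maximum the setting supports). Counts below the threshold are expected to be close to exact, at the cost of more memory per bucket. The two bool clauses are written as a "must" array here; everything else is taken from the original request.

```json
{
  "size": 0,
  "aggs": {
    "by_days": {
      "filter": {
        "bool": {
          "must": [
            {"term": {"AppID": 5091}},
            {"range": {"RecordTime": {"gte": 1479312000, "lt": 1479398400}}}
          ]
        }
      },
      "aggs": {
        "distinct_user": {
          "cardinality": {
            "script": "doc['Channel'].value + ' ' + doc['AreaServerID'].value + ' ' + doc['RoleID'].value",
            "precision_threshold": 40000
          }
        }
      }
    }
  }
}
```

The same setting can be added to the cardinality aggregation nested under the date_histogram in query #1. With around 3,100 distinct values, both counts would then fall well below the threshold, so the two results should come out much closer to each other, though the aggregation still does not guarantee an exact count.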

Hope this helps.
Jakob
