Any chance of getting zero percent error in a Cardinality Aggregation?

Elasticsearch claims to implement a non-deterministic algorithm called HyperLogLog++ [1] with a good trade-off between accuracy and performance. The official guide highlights that accuracy is excellent for low-cardinality sets. My question is: is it possible to get an exact count?

In my use case ES holds an index with about 4M documents, but these documents can be sliced into smaller chunks of roughly 10K documents per slice. A slice, and the documents belonging to it, can be identified through the slice_id field.

All queries sent to ES use a filter stage to select the slice_id, which means that each query involves at most 10K documents. It is worth mentioning that all documents belonging to the same slice are routed with the same key, so they all land on the same shard. A set of aggregation stages is then added to the query to extract information, and some of them use the cardinality aggregation to perform a distinct count.

The cardinality of the integer field within a slice is roughly 2k-4k values, but the aggregation is usually run after a filter stage that reduces the cardinality to a few hundred. A sketch of such a query is shown below.
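For reference, this is roughly what one of these queries looks like. The index, field, and filter names (my_index, slice_id, some_category, my_int_field) are placeholders for illustration; the sketch just posts the request body to the REST _search endpoint, reusing the same routing key that was used at index time:

```python
import requests

# Filter down to one ~10K-document slice (plus the extra filter that cuts the
# cardinality to a few hundred), then count distinct values of the integer field.
query = {
    "size": 0,  # we only care about the aggregation, not the hits
    "query": {
        "bool": {
            "filter": [
                {"term": {"slice_id": 42}},           # placeholder slice id
                {"term": {"some_category": "foo"}},   # placeholder extra filter
            ]
        }
    },
    "aggs": {
        "distinct_values": {
            "cardinality": {"field": "my_int_field"}
        }
    },
}

resp = requests.post(
    "http://localhost:9200/my_index/_search",
    json=query,
    params={"routing": "42"},  # same routing key as at index time, so a single shard is queried
)
print(resp.json()["aggregations"]["distinct_values"]["value"])
```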

How will the accuracy behave in this scenario? Can we expect an exact value? If not, what can I do?

Cheers,

[1] https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html

I had a bit of time to check this myself. As far as I can see, the error of the cardinality aggregation for small sets of values is not negligible: I was getting errors of 5% to 10%, which is far from an exact value. Raising the precision_threshold to 1000, I was able to get an exact value for all of the cardinalities; keep in mind that we are talking about cardinalities below 2k distinct values.
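For completeness, this is how I raised the threshold; precision_threshold is the documented option of the cardinality aggregation [1], and the field name is again a placeholder:

```python
# Same cardinality aggregation as above, but with precision_threshold raised.
# According to the documentation [1], counts below this threshold are expected
# to be very close to exact, at a memory cost of about precision_threshold * 8
# bytes per bucket.
aggs = {
    "distinct_values": {
        "cardinality": {
            "field": "my_int_field",       # placeholder field name
            "precision_threshold": 1000,   # documented maximum is 40000
        }
    }
}
```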

I am wondering whether I can guarantee this accuracy with this threshold over a field that has at most 2k different integer values, and how I can prove it. The documentation only talks about the memory used: in our case 1000 * 8 bytes, which is less than 10 KB. My real question is about the implementation itself: how the hashes are mapped onto this structure, and how an integer field behaves in it. Common sense says that in the best case an integer field with 2k distinct values would only need 2000 * 4 bytes, but obviously the HyperLogLog algorithm is not ad hoc and the best case cannot be assumed, so what should I expect?
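My rough understanding of how hashes map onto the structure is sketched below. This is a toy, plain-HyperLogLog estimator, not Elasticsearch's actual HyperLogLog++ implementation (which also has a sparse mode for low cardinalities): each value is hashed, the first p bits of the hash select a register, and the register keeps the largest run of leading zeros seen in the remaining bits. With only ~2k distinct values and thousands of registers, most registers stay empty and the small-range (linear counting) correction gives a near-exact estimate, which would be consistent with the exact values I saw:

```python
import hashlib
import math

def hll_estimate(values, p=14):
    """Toy plain-HyperLogLog estimator: hash each value, use the first p bits
    of the hash to pick a register, and keep the maximum 1-based position of
    the leftmost set bit seen in the remaining 64 - p bits."""
    m = 1 << p                       # number of registers (2^p)
    registers = [0] * m
    for v in values:
        # 64-bit hash of the value; Elasticsearch hashes values too (MurmurHash3),
        # any well-mixed hash illustrates the idea
        h = int.from_bytes(hashlib.sha1(str(v).encode()).digest()[:8], "big")
        idx = h >> (64 - p)                      # first p bits -> register index
        w = h & ((1 << (64 - p)) - 1)            # remaining bits
        rank = (64 - p) - w.bit_length() + 1     # leading zeros + 1
        registers[idx] = max(registers[idx], rank)

    # raw HyperLogLog estimate
    alpha = 0.7213 / (1 + 1.079 / m)
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)

    # small-range correction: with only a few thousand distinct values most
    # registers stay at 0, and linear counting over the empty registers is
    # what makes the estimate (almost) exact in that regime
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        return m * math.log(m / zeros)
    return raw

# ~2000 distinct integers, far fewer than the 16384 registers
print(round(hll_estimate(range(2000))))   # expected to be 2000 or very close
```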

Cheers,