My current understanding of cardinality aggregations is that the best performance comes from precomputing hashes of the values and indexing them alongside, or instead of, the original field value.
Aggregations then use these precomputed hashes to build a HyperLogLog++ data structure on the fly and ultimately return a cardinality estimate.
Although HLLs are memory efficient, they do take a non-trivial amount of time to build for a field with lots of values.
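To make sure we're talking about the same thing, here is roughly how I picture the build step, as a minimal Python sketch. It is plain HyperLogLog rather than HLL++ (no sparse encoding or bias correction), the precision `P = 14` is an arbitrary choice of mine, and `hash64` is a hypothetical stand-in for the precomputed hash, not what Elasticsearch actually stores.

```python
import hashlib
import math

P = 14          # assumed precision: 2**14 one-byte registers
M = 1 << P

def hash64(value: str) -> int:
    """Hypothetical stand-in for a precomputed 64-bit hash of a field value."""
    digest = hashlib.blake2b(value.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")

def add(registers: bytearray, h: int) -> None:
    """Fold one 64-bit hash into the register array."""
    idx = h >> (64 - P)                       # top P bits select a register
    rest = h & ((1 << (64 - P)) - 1)          # remaining 64 - P bits
    rank = (64 - P) - rest.bit_length() + 1   # 1-based position of the leftmost 1-bit
    if rank > registers[idx]:
        registers[idx] = rank

def estimate(registers: bytearray) -> float:
    """Plain HyperLogLog estimate with the small-range linear-counting fallback."""
    alpha = 0.7213 / (1 + 1.079 / M)
    raw = alpha * M * M / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * M and zeros:
        return M * math.log(M / zeros)
    return raw

if __name__ == "__main__":
    regs = bytearray(M)
    for i in range(100_000):
        add(regs, hash64(f"user-{i}"))
    print(round(estimate(regs)))   # an approximation of 100000, not an exact count
```

The point is that every value still has to pass through `add` at query time, even when the hash itself is already on disk.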
I'm wondering why we can't specify in the mapping, at index creation time, that a field is only used for aggregations, so that Elasticsearch can build and store a serialized HLL++ in the document rather than just the precomputed hashes for the field.
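The reason I think this could pay off is that pre-built sketches are cheap to combine: merging two of them is just an element-wise max over their registers, with no rehashing. Here is a rough Python illustration of that property. The `sketch` and `merge` helpers, the one-byte-per-register layout, and the precision are all my own assumptions for the example, not anything Elasticsearch exposes.

```python
import hashlib

PRECISION = 12                 # assumed: 2**12 one-byte registers
REGISTERS = 1 << PRECISION

def sketch(values) -> bytes:
    """Hypothetical helper: build a serialized register array from raw values."""
    regs = bytearray(REGISTERS)
    for v in values:
        digest = hashlib.blake2b(str(v).encode(), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        idx = h >> (64 - PRECISION)                       # register index
        rest = h & ((1 << (64 - PRECISION)) - 1)          # remaining bits
        rank = (64 - PRECISION) - rest.bit_length() + 1   # leftmost 1-bit position
        regs[idx] = max(regs[idx], rank)
    return bytes(regs)

def merge(a: bytes, b: bytes) -> bytes:
    """Combine two stored sketches: element-wise max, no access to the values."""
    return bytes(map(max, a, b))

if __name__ == "__main__":
    left = sketch(range(0, 6000))
    right = sketch(range(4000, 10000))
    # Merging the two stored sketches gives the same registers as
    # building one sketch over the union of the values.
    print(merge(left, right) == sketch(range(10000)))
```

So if sketches like these were stored at index time, a cardinality query would only need to merge them and run the final estimate.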
Lemme know if I've misunderstood anything.