I'm using a bool query with a mix of match, prefix and phrase queries in should, must and filter, and running a terms aggregation over a keyword field. The results of the aggregation can be used to add further filters to the query (i.e. a simple faceted search).
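To make the setup concrete, here is a minimal sketch of the kind of request body I mean, built as a plain Python dict. The field names (title, body, status, tag), the query values and the aggregation name by_tag are placeholders for illustration, not my real mapping.

```python
import json


def build_search_body(extra_filters=None):
    """Build a bool query plus a terms aggregation over a keyword field.

    All field names and values here are hypothetical placeholders.
    """
    filters = [{"term": {"status": "published"}}]
    # Facet selections made by the user are appended as further filters.
    filters.extend(extra_filters or [])
    return {
        "query": {
            "bool": {
                "must": [{"match": {"title": "quick fox"}}],
                "should": [
                    {"prefix": {"title": "qui"}},
                    {"match_phrase": {"body": "quick brown fox"}},
                ],
                "filter": filters,
            }
        },
        "aggs": {
            # Terms aggregation over the keyword field used for faceting.
            "by_tag": {"terms": {"field": "tag"}}
        },
    }


body = build_search_body()
print(json.dumps(body, indent=2))
```

Passing a list such as [{"term": {"tag": "some-tag"}}] as extra_filters gives the narrowed query that a facet click would produce.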
As far as I understand aggregations, they work (per shard) on the result set returned by the query. After going through the "Count is approximate" section of the terms aggregation page, I would understand if the doc_count returned by the aggregation were too small (since the relevant keyword might not be among the buckets returned from every shard). But how can it be too large?
I have a case where one keyword in the aggregation gets a doc_count of 2, yet adding a filter on this keyword returns an empty result set.
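The follow-up step looks roughly like the sketch below: the bucket key is added as one more term filter on the same bool query. The helper name with_facet_filter, the field name tag and the value some-tag are made up for illustration.

```python
import copy


def with_facet_filter(body, field, value):
    """Return a copy of the search body with one extra term filter.

    The original body is left untouched so both requests can be compared.
    """
    narrowed = copy.deepcopy(body)
    bool_query = narrowed["query"]["bool"]
    bool_query.setdefault("filter", []).append({"term": {field: value}})
    return narrowed


# Hypothetical original query; only the structure matters here.
original = {"query": {"bool": {"must": [{"match": {"title": "quick fox"}}]}}}
narrowed = with_facet_filter(original, "tag", "some-tag")
```

My expectation is that the number of hits for the narrowed request equals the doc_count of the corresponding bucket in the original response.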
I've tried increasing size as well as shard_size to a value greater than the total number of buckets, in order to force an exact count, but the wrong count persists. doc_count_error_upper_bound is 0 for this aggregation.
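For reference, a sketch of how the two parameters are set and how I read the response. The field name, the value 10000 and the sample response fragment are invented for illustration. The check on sum_other_doc_count is an extra I added here on the assumption that it shows whether any buckets were left out of the response.

```python
def terms_agg(field, size, shard_size):
    """Terms aggregation with explicit size and shard_size."""
    return {"terms": {"field": field, "size": size, "shard_size": shard_size}}


def looks_exact(agg_result):
    """True when the response reports no count error and no omitted buckets."""
    return (
        agg_result.get("doc_count_error_upper_bound", 0) == 0
        and agg_result.get("sum_other_doc_count", 0) == 0
    )


# Hypothetical values: both larger than the total number of distinct terms.
agg = terms_agg("tag", size=10000, shard_size=10000)

# Invented response fragment in the shape a terms aggregation returns.
sample_result = {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [{"key": "some-tag", "doc_count": 2}],
}
exact = looks_exact(sample_result)
```

With both checks passing I would expect the bucket counts to be exact, which is why the count of 2 puzzles me.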
Clearly I'm not understanding something about how the terms aggregation works. Do aggregations disregard certain query clauses or filters when aggregating the results, so that I'm seeing a case where the 2 counted documents are filtered out later on?