Term aggregation count too high

I'm using a boolean query with a mix of match, prefix and phrase queries in the should, must and filter clauses, and running a terms aggregation over a keyword field. The results of the aggregation can be used to add further filters to the query (i.e. a simple faceted search).

As far as I understand aggregations, they should work (per shard) on the result set returned by the query. After going through the "Count is approximate" section of the terms aggregation page, I would understand if the doc_count returned by the aggregation were too small (since the relevant keyword might not be among the buckets returned from a shard). But how can it be too large?

However, I have a case where one keyword in the aggregation gives me a doc_count of 2, yet adding a filter on this keyword returns an empty set.

I've tried increasing size as well as shard_size to a value greater than the total number of buckets, in order to force an exact count, but the count stays the same. doc_count_error_upper_bound is 0 for this aggregation.
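For readers following along, the kind of request described above might look roughly like the sketch below. It only builds the JSON body as a Python dict. The field names (`title`, `category`), the aggregation name `facet`, and the helper names are made up for illustration and are not taken from the original query.

```python
def facet_filter(field, value):
    """Term filter for one selected facet value (hypothetical helper)."""
    return {"term": {field: value}}


def build_facet_request(keyword_field, bucket_count, extra_filters=None):
    """Build a search body with a bool query and a terms aggregation.

    size and shard_size are both set to bucket_count, mirroring the
    attempt to force an exact count by covering every bucket.
    """
    return {
        "query": {
            "bool": {
                # Placeholder scoring clause; the real query mixes
                # match, prefix and phrase queries.
                "must": [{"match": {"title": "example"}}],
                # Facet selections go into the non-scoring filter clause.
                "filter": list(extra_filters or []),
            }
        },
        "aggs": {
            "facet": {
                "terms": {
                    "field": keyword_field,
                    "size": bucket_count,
                    "shard_size": bucket_count,
                }
            }
        },
    }


body = build_facet_request(
    "category", 1000, [facet_filter("category", "books")]
)
```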

Clearly I'm not understanding something about how the terms aggregation works. Do aggregations disregard certain query matches or filters when aggregating the results, so that I'm seeing a case where the 2 counted documents are filtered out later on?

You are correct in your assumption that the inaccuracies relate to under-counting, never over-counting.
Can you double check the filter that produces zero results? What does that query look like?
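To illustrate why the approximation can only under-count, here is a toy simulation (not Elasticsearch code): each shard reports only its top shard_size terms, and the coordinating step sums what it receives. The shard contents and function names are invented for the example.

```python
from collections import Counter


def merge_shard_buckets(shards, shard_size):
    """Sum the top `shard_size` term counts reported by each shard."""
    merged = Counter()
    for shard in shards:
        for term, count in shard.most_common(shard_size):
            merged[term] += count
    return merged


def exact_counts(shards):
    """Sum every term count from every shard (the true totals)."""
    total = Counter()
    for shard in shards:
        total.update(shard)
    return total


# Two made-up shards; "c" falls outside the top 2 on the first shard.
shards = [
    Counter({"a": 5, "b": 4, "c": 3}),
    Counter({"c": 5, "b": 1}),
]

approx = merge_shard_buckets(shards, shard_size=2)
exact = exact_counts(shards)
```

In this toy data the merged count for "c" misses the 3 documents on the first shard, so it comes out lower than the true total. A merged count can never exceed the true one, because every reported number is a real per-shard count.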

After banging my head against the wall for two hours before posting, I noticed while copying out the query text that we had an error in the pagination code: the from parameter was greater than the total number of items found. After fixing that, everything seems to be working as expected... When you're hip deep in complex stuff that's new to you, a trivial bug can completely trip you up. Ah well...
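For anyone hitting the same symptom: aggregations are computed over all documents matching the query, while from and size only select which hits are returned. A from value past the end therefore gives an empty hit list next to a non-zero doc_count. The toy model below shows the effect and one possible guard. It is a sketch with invented names and data, not the actual pagination code from this project.

```python
from collections import Counter


def paginate(matching_docs, from_, size):
    """Return the page of hits selected by from/size."""
    return matching_docs[from_:from_ + size]


def facet_counts(matching_docs, field):
    """Count field values over ALL matching docs, ignoring paging."""
    return Counter(doc[field] for doc in matching_docs)


def clamp_from(from_, size, total):
    """Clamp `from` to the start of the last non-empty page."""
    if total <= 0:
        return 0
    last_page_start = ((total - 1) // size) * size
    return min(max(from_, 0), last_page_start)


# Two made-up documents match the filtered query.
matching = [{"category": "books"}, {"category": "books"}]

stale_page = paginate(matching, from_=20, size=10)   # offset left over from an earlier page
counts = facet_counts(matching, "category")
fixed_page = paginate(matching, clamp_from(20, 10, len(matching)), 10)
```

Here `stale_page` is empty while `counts` still reports 2 for "books", which matches what was observed. Resetting from to 0 whenever the filters change would be another way to avoid the problem.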

Thank you for confirming my assumptions on the counting of the aggregations though, that helps in another part of the project :).

Some mistakes just want a bigger audience :)
