Limiting the number of documents for each bucket in a terms aggregation

Hi,

There is a set of terms. I want to know which of them are present in the search results, and I use a terms aggregation for this purpose.

Is there some way to limit the number of documents collected for each bucket in a terms aggregation in order to improve performance?
All I want to know is whether there is at least one document containing each term; I don't need an exact document count per term.
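For context, this is roughly the kind of request I'm running (the index name `my-index`, the field `tags`, and the term values are placeholders):

```json
POST /my-index/_search
{
  "size": 0,
  "aggs": {
    "present_terms": {
      "terms": {
        "field": "tags",
        "include": ["alpha", "beta", "gamma"],
        "size": 100
      }
    }
  }
}
```

I only need to know which buckets come back non-empty, not their `doc_count` values.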

Or maybe there is another more efficient way to solve this problem?
Thank you.

Couldn't this simply be a search/count operation per term, combined with terminate_after to speed it up? See Search your data | Elasticsearch Guide [7.14] | Elastic
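For example, something like this per term (index and field names are placeholders): with `terminate_after: 1`, each shard stops collecting as soon as it finds one matching document, so a non-zero `hits.total.value` tells you the term exists.

```json
GET /my-index/_search
{
  "size": 0,
  "terminate_after": 1,
  "query": {
    "term": { "tags": "alpha" }
  }
}
```

That's one request per term, but each request can short-circuit early instead of scanning every matching document.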

What's the business question you're trying to answer with this request?

If your goal is to count the number of unique terms, use the cardinality aggregation.
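That would look something like this (index and field names are placeholders); note the result is an approximate count of distinct values, not a per-term breakdown:

```json
POST /my-index/_search
{
  "size": 0,
  "aggs": {
    "unique_terms": {
      "cardinality": { "field": "tags" }
    }
  }
}
```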

If you want the most popular terms from a large set of unique terms, just faster, then skipping the counting of some terms may lead to inaccuracies in which subset of terms ends up in the final result.

If the number of unique terms in the index is small (so you always return the full set rather than just the top N), we'd need to terminate the search's collection of docs only after a defined number of terms had been discovered. But there's no way for you to provide that expected number (or perhaps even to know what to expect). We can't know that there isn't one more term waiting at the end of a very long stream of matching docs, so it's hard to add a shortcut.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.