Terms aggregation: how many is too many?

The documentation for the terms aggregation states that to retrieve all terms you would normally use a composite aggregation and page through the results.
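
For reference, a minimal sketch of that pattern, assuming a hypothetical index `my-index` and keyword field `my_field` (swap in your own names):

```
GET my-index/_search
{
  "size": 0,
  "aggs": {
    "all_terms": {
      "composite": {
        "size": 500,
        "sources": [
          { "term_value": { "terms": { "field": "my_field" } } }
        ]
      }
    }
  }
}
```

Each page of the response carries an `after_key`; you repeat the request with that value in an `"after"` clause inside the `composite` block until no more buckets come back.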

Since for our current use-case having a one-shot query would be much easier, I'm wondering if there is any rule of thumb for how many terms are too many.

At the moment we have around 600 terms, we don't anticipate going above 1,000 in the near future, and the query is speedy (<150 ms).
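
For context, the one-shot query we have in mind is just a plain terms aggregation with `size` bumped above the expected cardinality (again, `my-index` and `my_field` are placeholders):

```
GET my-index/_search
{
  "size": 0,
  "aggs": {
    "all_terms": {
      "terms": {
        "field": "my_field",
        "size": 1000
      }
    }
  }
}
```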

I realize there are many variables to account for, and if you tell me which numbers you need I would be happy to provide them :slight_smile:

Alternatively, can you suggest some metrics to keep an eye on? In other words, what would the main concern be with a terms aggregation of large size? Memory usage?

By default, the total number of buckets returned in a single aggregation response is limited to 10,000 (the `search.max_buckets` cluster setting). See https://www.elastic.co/guide/en/elasticsearch/reference/7.6/search-aggregations-bucket.html

That might be one of the limits you want to monitor, on top of the overall memory requirements of running Elasticsearch (keeping in mind that several of these queries can arrive in parallel).
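
If you want to check or adjust that limit, `search.max_buckets` is a dynamic cluster setting; a sketch of how that might look (the value below is only an example, and raising it increases the potential memory cost per request):

```
# Check the current/default value
GET _cluster/settings?include_defaults=true&filter_path=*.search.max_buckets

# Raise the limit cluster-wide (example value)
PUT _cluster/settings
{
  "persistent": {
    "search.max_buckets": 20000
  }
}
```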

Hope that helps!
