Requesting background info on `search.max_buckets` change

Currently (in 6.x) the default value of `search.max_buckets` is -1, which means there is no limit. We've started getting warnings that the default will change to 10,000 in future versions (we currently have a terms aggregation that we've seen return over 27,000 buckets).
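For context, the kind of request I mean looks roughly like this, stripped down (the index and field names here are just placeholders, not our real mapping):

```
POST /my-index/_search
{
  "size": 0,
  "aggs": {
    "by_customer": {
      "terms": {
        "field": "customer_id",
        "size": 50000
      }
    }
  }
}
```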

Can someone speak a bit, or link me to some info, about why this change is being made? Is it a heap space issue? Will it be possible -- or wise -- to set `search.max_buckets` to -1 explicitly so as to disable it in future versions?

The reason I ask is this: we can (and probably should) turn the particular aggregation I reference above into a composite aggregation, but there are many places in our application where we use a terms aggregation, and I don't think we necessarily want to make them all composite aggregations. In some cases we can't -- e.g. a composite aggregation can't be used as a sub-aggregation of another aggregation, so if we want to aggregate on a nested field, we're out of luck. We definitely don't want to risk throwing an exception, and I'm not too keen on giving the end user incorrect results, or incorrect statistics about those results, either.

In a nutshell, I've been thinking about how to identify every place we do a terms aggregation, figure out if it's even conceivable that it could return more than 10,000 buckets, and decide what to do about it in each case, and it's giving me a massive headache.

It seems like it'd be great if there were an option to configure it to behave essentially as it does now -- a soft limit that logs a warning if you go over it, rather than throwing an exception.

Thanks for any insight you can provide.

This change was introduced in https://github.com/elastic/elasticsearch/pull/27581. There are a variety of situations that led to it, but https://github.com/elastic/elasticsearch/issues/26012 and https://github.com/elastic/elasticsearch/issues/27452 are a few examples. It's a configurable setting (and a dynamic one), so you can override it as you like. It would still be wise to convert as many of the "huge" aggs to composite aggs as you can.
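For reference, since it's a dynamic cluster setting you can raise (or lower) it on a running cluster with something along these lines -- the value here is just an example:

```
PUT /_cluster/settings
{
  "persistent": {
    "search.max_buckets": 20000
  }
}
```

And converting a large terms agg to a composite agg generally ends up shaped like this (names are illustrative), after which you page through the buckets by passing each response's `after_key` back in as `after`:

```
POST /my-index/_search
{
  "size": 0,
  "aggs": {
    "all_customers": {
      "composite": {
        "size": 1000,
        "sources": [
          { "customer": { "terms": { "field": "customer_id" } } }
        ]
      }
    }
  }
}
```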

Thanks for your swift reply. I looked at that PR and now have another question. For a terms aggregation, would setting `size` to 10,000 guarantee that the `TooManyBucketsException` would never be thrown (assuming the default value of `search.max_buckets`)? Or would it still be possible that some internal phase would go over 10,000 and throw the exception?

Setting `size` at the top level of the agg would just define the number of top-level buckets. If you had 1,000 top-level buckets with 1,000 buckets nested under each of them, that's 1,000,000 buckets in total, which would still hit the limit.
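Roughly speaking, a request shaped like this (field names invented) asks for up to 1,000 × 1,000 = 1,000,000 buckets overall, so it would trip the limit even though neither individual terms agg asks for more than 10,000:

```
POST /my-index/_search
{
  "size": 0,
  "aggs": {
    "by_country": {
      "terms": { "field": "country", "size": 1000 },
      "aggs": {
        "by_city": {
          "terms": { "field": "city", "size": 1000 }
        }
      }
    }
  }
}
```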

That is a good point too. When I said "internal phase" I was thinking more about what happens on the individual shards even for a one-level aggregation, though. I may be betraying my very, very rough understanding of how aggregations on shards work.

Thanks for all your help.
