So the default value of search.max_buckets is currently -1 (in 6.x), which means there is no limit. We've started getting warnings that the default will change to 10,000 in future versions (we currently have a terms aggregation that we've seen return over 27,000 buckets).
Can someone speak a bit, or link me to some info, about why this change is being made? Is it a heap-space issue? Will it be possible -- or wise -- to set search.max_buckets to -1 explicitly so as to disable the limit in future versions?
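For reference, this is how I'd expect to raise (or try to disable) the limit, assuming search.max_buckets stays a dynamic cluster setting -- the value here is just an arbitrary example, and whether -1 will still be accepted later is exactly what I'm asking:

```
PUT _cluster/settings
{
  "persistent": {
    "search.max_buckets": 20000
  }
}
```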
The reason I ask is this: we can (and probably should) convert the particular aggregation I mentioned above into a composite aggregation (roughly as sketched below), but there are many places in our application where we use a terms aggregation, and I don't think we want to make them all composite. In some cases we can't -- e.g. a composite aggregation can't be used as a sub-aggregation of another aggregation, so if we want to aggregate on a nested field, we're out of luck. We definitely don't want to risk throwing an exception, and I'm not keen on giving the end user incorrect results, or incorrect statistics about those results, either.
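Here's roughly what that conversion looks like -- the index name, field name, and page size are all made up for illustration. Instead of one huge response, you page through buckets, and (on 6.3+, where the response includes an after_key) you feed that key back in an "after" parameter to get the next page:

```
GET /events/_search
{
  "size": 0,
  "aggs": {
    "all_categories": {
      "composite": {
        "size": 1000,
        "sources": [
          { "category": { "terms": { "field": "category.keyword" } } }
        ]
      }
    }
  }
}
```

That works fine at the top level, but as noted above, it's not an option when the aggregation has to live under a nested aggregation.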
In a nutshell, I've been thinking about how to identify every place we do a terms aggregation, figure out whether it's even conceivable that it could return more than 10,000 buckets, and decide what to do about each one -- and it's giving me a massive headache.
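The best triage I've come up with so far is running a cardinality aggregation against each candidate field to see which ones could even approach the limit -- again, index and field names here are hypothetical, and cardinality is only an approximate count:

```
GET /events/_search
{
  "size": 0,
  "aggs": {
    "distinct_categories": {
      "cardinality": { "field": "category.keyword" }
    }
  }
}
```

And since (as I understand it) the limit counts buckets across the whole aggregation tree, including sub-aggregations, a single field's cardinality is only a rough lower bound anyway.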
It seems like it'd be great if there were an option to keep essentially the current behavior -- a soft limit that logs a warning when you go over it, rather than throwing an exception.
Thanks for any insight you can provide.