Hey Elastic friends! I recently set up my Elastic cluster on the new 7.0.0 release specifically to use the Uptime visualizations. I got everything working well, but now when I look at the Uptime tab I see this error underneath the uptime summary:
Error GraphQL error: [too_many_buckets_exception] Trying to create too many buckets. Must be less than or equal to: [10000] but was [10001]. This limit can be set by changing the [search.max_buckets] cluster level setting., with { max_buckets=10000 }
I came across the thread "Requesting background info on `search.max_buckets` change", which says this value is configurable; however, I am not sure where to configure it or what effect changing it might have on the rest of my cluster. Aside from having configured ILM, everything is at its default setting, so perhaps the defaults should be changed?
I look forward to hearing suggestions for how I can remediate this issue; thanks!
I think that from a scalability standpoint we can treat this as a bug; you shouldn't need to modify your cluster settings in order to view a relatively large number of monitors over an arbitrarily wide time range.
Can you let me know a few things so I can make sure we've covered your case? It looks like you're running 7.0 with 26 monitors. What is the selected date range, and how frequently are your monitors pinging?
Thank you for your prompt reply, @jkambic! The range is 'Last 1 hour'; my three http monitors ping every 10 seconds, and the icmp/tcp monitors that make up the remainder ping every 5 seconds.
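For context, the monitor definitions in my heartbeat.yml look roughly like this (the URLs and hosts below are placeholders rather than my real endpoints):

```yaml
heartbeat.monitors:
  # Three http monitors, checking every 10 seconds
  - type: http
    schedule: '@every 10s'
    urls: ["https://app.example.internal", "https://api.example.internal", "https://www.example.internal"]
  # The rest of the 26 monitors are icmp/tcp entries like these, checking every 5 seconds
  - type: icmp
    schedule: '@every 5s'
    hosts: ["192.0.2.10", "192.0.2.11"]
  - type: tcp
    schedule: '@every 5s'
    hosts: ["192.0.2.20:9200"]
```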
@phillhocking to give you an update, I created this issue to track the bug. We'll try to address it as soon as we have time and backport the fix to the version you're using.
This is excellent, @jkambic, and I really appreciate all of your hard work on this. Is there anything I can do to remediate this on my cluster while I wait for the patch/release that solves the issue?
@phillhocking the error is the result of too much data being selected for a given time range; the best way to resolve this temporarily is to select a smaller slice of data. Does a range like now-15m to now cause the error to occur as well? If it doesn't, I'd start there and work your way out to wider ranges until you see the error again, and treat the widest range that works as your limit for now. I know that's not ideal, but it would be a temporary workaround.
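If you'd rather not shrink the range, `search.max_buckets` is a dynamic cluster-level setting, so a call along these lines should raise it without a restart. The value below is just an example; a higher limit lets aggregations build more buckets (and use more memory), so I'd revert it once the fix ships:

```bash
# Temporarily raise the bucket limit cluster-wide (example value, not a recommendation).
# Using "transient" so it resets on a full cluster restart; revert once the fix lands.
curl -X PUT "localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' \
  -d '{ "transient": { "search.max_buckets": 20000 } }'
```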
EDIT: I've opened a PR related to this. We'll try to get it reviewed and merged next week.