ES aggregation query

niraj_pandey · October 26, 2020, 9:36am

I am running a huge aggregation query and getting the following error.

This aggregation creates too many buckets (10001) and will throw an error in future versions. You should update the [search.max_buckets] cluster setting or use the [composite] aggregation to paginate all buckets in multiple requests.

This is my query.

{
  "aggs": {
    "projectname": {
      "terms": { "field": "project.keyword", "order": { "_count": "desc" } },
      "aggs": {
        "username": {
          "terms": { "field": "user.keyword", "order": { "_count": "desc" } },
          "aggs": {
            "currdir": {
              "terms": { "field": "CWD.keyword", "order": { "_count": "desc" } },
              "aggs": {
                "reqmem": {
                  "terms": { "field": "reqmem", "order": { "_count": "desc" } },
                  "aggs": {
                    "reqres": {
                      "terms": { "field": "reqres.keyword", "order": { "_count": "desc" } },
                      "aggs": {
                        "noproc": { "max": { "field": "no_proc" } },
                        "mm": { "max": { "field": "max_mem" } },
                        "avgmem": { "avg": { "field": "max_mem" } },
                        "rt": { "max": { "field": "run_time" } },
                        "avgrt": { "avg": { "field": "run_time" } },
                        "pcm": { "max": { "field": "per_core_memory" } },
                        "avgpcm": { "avg": { "field": "per_core_memory" } },
                        "ptime": { "max": { "field": "pend_time" } },
                        "avgptime": { "avg": { "field": "pend_time" } },
                        "cputime": { "max": { "field": "ru_utime" } },
                        "avgcputime": { "avg": { "field": "ru_utime" } }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "query": {
    "bool": {
      "must": [
        { "match_all": {} },
        { "match_phrase": { "cluster": { "query": "abc01" } } },
        { "match_phrase": { "queue": { "query": "cxx64" } } },
        { "range": { "@timestamp": { "gte": "2020-09-01T00:00:00", "lte": "2020-09-30T23:59:59" } } }
      ]
    }
  }
}

Our ELK admin does not allow us to change the "search.max_buckets" value.
Any idea how to fix this?

Hendrik_Muhs · October 26, 2020, 10:18am

Can you provide some more details?

Which Elasticsearch version are you on?
According to your query you need project * user * cwd * reqmem * reqres buckets, which I guess is way more than 10k. Do you have an idea how many buckets this requires? As far as I know, aggregations stop as soon as they overflow, so the real number is higher than 10001.
How often do you intend to run this query?
What do you intend to do with the result?
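A rough way to reason about the bucket count asked about above, as a minimal sketch. The distinct-value counts are made-up placeholders, not numbers from this thread, and it is an assumption here that parent buckets at every level count toward the limit in addition to the leaf combinations.

```python
def worst_case_buckets(cardinalities):
    """Return (leaf_combinations, cumulative_total) for nested terms aggregations.

    cardinalities: distinct values per grouping field, outermost first.
    Worst case: every value of a field occurs under every parent bucket.
    """
    running = 1   # buckets at the current nesting level
    total = 0     # buckets summed over all levels
    for count in cardinalities:
        running *= count
        total += running
    return running, total


# Hypothetical distinct-value counts for project, user, CWD, reqmem, reqres.
leaf, total = worst_case_buckets([5, 4, 10, 8, 3])
print(leaf)   # 4800 leaf combinations (the plain product)
print(total)  # 6625 buckets when the parent levels are counted too
```

Real data is usually far sparser than this worst case, since not every user appears under every project, so the result is only an upper bound.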

As the error message says, use a composite aggregation. If you want to do further analysis based on the output of the query, you should consider a transform, which is basically a composite aggregation that stores its result as documents. Your query makes me think you want monthly buckets in addition to the groupings.

niraj_pandey · October 26, 2020, 11:01am

Thanks Hendrik.
Here are the details:

1- Elasticsearch version: 6.2.4
2- Number of buckets: ~40k
3- Frequency to run the query: 1-2 times a week
4- Purpose: collect the data and analyze the workload

Hendrik_Muhs · October 26, 2020, 11:26am

In this case a composite aggregation is your best option.

With newer versions this might get easier:

transform is available in 7.5 and later
search.max_buckets defaults to 65k in 7.9 and later

niraj_pandey · October 26, 2020, 12:05pm

Can you guide me on how to implement a composite aggregation?

Thanks
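The thread ends without an answer, so here is a minimal sketch of what the composite version of the query above might look like. It is not tested against 6.2.4. The aggregation name `groups`, the page size, and the helper names `build_body` and `iter_buckets` are made up for illustration, and `search` is a callable you supply (for example a thin wrapper around your Elasticsearch client's search call). The five terms groupings become composite sources, the metrics stay as sub-aggregations, and each page asks for fewer buckets than the limit.

```python
# Metric sub-aggregations copied from the original query.
METRICS = {
    "noproc": {"max": {"field": "no_proc"}},
    "mm": {"max": {"field": "max_mem"}},
    "avgmem": {"avg": {"field": "max_mem"}},
    "rt": {"max": {"field": "run_time"}},
    "avgrt": {"avg": {"field": "run_time"}},
    "pcm": {"max": {"field": "per_core_memory"}},
    "avgpcm": {"avg": {"field": "per_core_memory"}},
    "ptime": {"max": {"field": "pend_time"}},
    "avgptime": {"avg": {"field": "pend_time"}},
    "cputime": {"max": {"field": "ru_utime"}},
    "avgcputime": {"avg": {"field": "ru_utime"}},
}

# Filters copied from the original query.
QUERY = {"bool": {"must": [
    {"match_all": {}},
    {"match_phrase": {"cluster": {"query": "abc01"}}},
    {"match_phrase": {"queue": {"query": "cxx64"}}},
    {"range": {"@timestamp": {"gte": "2020-09-01T00:00:00",
                              "lte": "2020-09-30T23:59:59"}}},
]}}


def build_body(after=None, page_size=1000):
    """Build one page of the request; `after` is the key of the last bucket seen."""
    composite = {
        "size": page_size,  # buckets per page, kept well below search.max_buckets
        "sources": [
            {"projectname": {"terms": {"field": "project.keyword"}}},
            {"username": {"terms": {"field": "user.keyword"}}},
            {"currdir": {"terms": {"field": "CWD.keyword"}}},
            {"reqmem": {"terms": {"field": "reqmem"}}},
            {"reqres": {"terms": {"field": "reqres.keyword"}}},
        ],
    }
    if after is not None:
        composite["after"] = after
    return {
        "size": 0,  # no hits needed, only aggregation results
        "query": QUERY,
        "aggs": {"groups": {"composite": composite, "aggs": METRICS}},
    }


def iter_buckets(search, page_size=1000):
    """Yield every composite bucket, requesting pages until one comes back empty.

    `search` takes a request body (dict) and returns the response as a dict.
    """
    after = None
    while True:
        agg = search(build_body(after, page_size))["aggregations"]["groups"]
        buckets = agg["buckets"]
        if not buckets:
            return
        yield from buckets
        # Fall back to the last bucket's key if the response has no after_key.
        after = agg.get("after_key") or buckets[-1]["key"]
```

Each bucket's `key` holds the five grouping values and `doc_count` the number of matching documents. Composite buckets come back in key order rather than by `_count`, so the descending-count ordering of the original query would have to be applied on the client after collecting the pages.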

