Hey, I have a query that does the following. It's a set of nested terms aggregations (group by = terms):
- filter by date, from year 2015 to 2020
- group by region
  - group by date, from year 2015 to 2020
  - group by country
    - group by date, from year 2015 to 2020
    - group by city
      - group by date, from year 2015 to 2020
      - group by sales name
        - group by date, from year 2015 to 2020
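To make the shape of the request concrete, here is a minimal sketch of how such a body could be built in Python. The field names (`region`, `country`, `city`, `sales_name`, `year`, `date`, `amount`), the `size` values and the `sum` metric are assumptions for illustration only; the real mapping may differ.

```python
# Hypothetical field names; the real index mapping may differ.
LEVELS = ["region", "country", "city", "sales_name"]


def by_year():
    # One bucket per year, each carrying the sum of a hypothetical "amount" field.
    return {
        "terms": {"field": "year", "size": 6},
        "aggs": {"total": {"sum": {"field": "amount"}}},
    }


def build_aggs(levels):
    """Nest one terms aggregation per level; every level also gets a per-year breakdown."""
    if not levels:
        return {}
    head, rest = levels[0], levels[1:]
    sub = {"by_year": by_year()}
    sub.update(build_aggs(rest))  # the next level is nested next to the per-year breakdown
    return {f"by_{head}": {"terms": {"field": head, "size": 1000}, "aggs": sub}}


def build_query():
    """Full request body: date filter plus the nested aggregations, no hits returned."""
    return {
        "size": 0,
        "query": {"range": {"date": {"gte": "2015-01-01", "lte": "2020-12-31"}}},
        "aggs": build_aggs(LEVELS),
    }
```

Every level multiplies the number of buckets of the level above it, which is why the total grows so quickly.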
The output looks like this. These are the sales broken down by region, country, city, salesperson and year:
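| Region | Country | City | Name | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 |
|---|---|---|---|---|---|---|---|---|---|
| America | | | | 4000 | 8000 | 4000 | 6000 | 2000 | 4000 |
| | United States | | | 2000 | 4000 | 2000 | 3000 | 1000 | 2000 |
| | | New York | | 1000 | 2000 | 1000 | 1500 | 500 | 1000 |
| | | | James | 500 | 1000 | 500 | 750 | 250 | 500 |
| | | | John | 500 | 1000 | 500 | 750 | 250 | 500 |
| | | Los Angeles | | 1000 | 2000 | 1000 | 1500 | 500 | 1000 |
| | | | James | 500 | 1000 | 500 | 750 | 250 | 500 |
| | | | John | 500 | 1000 | 500 | 750 | 250 | 500 |
| | Mexico | | | 2000 | 4000 | 2000 | 3000 | 1000 | 2000 |
| | | Mexico | | 1000 | 2000 | 1000 | 1500 | 500 | 1000 |
| | | | James | 500 | 1000 | 500 | 750 | 250 | 500 |
| | | | John | 500 | 1000 | 500 | 750 | 250 | 500 |
| | | Guadalajara | | 1000 | 2000 | 1000 | 1500 | 500 | 1000 |
| | | | James | 500 | 1000 | 500 | 750 | 250 | 500 |
| | | | John | 500 | 1000 | 500 | 750 | 250 | 500 |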
Problem: I hit the max_buckets limit. The limit is 10,000, and it turns out I need 60,000 buckets.
Details:
- This requirement is non-negotiable. The data has to be returned like this.
- I tried to limit the requirement to loading only one level at a time, but we have a download button that has to retrieve the whole dataset anyway.
- It's not cacheable, unfortunately, because this report depends on p parameters that can each have n values. Caching it would mean caching more than 1,000 variations.
So right now I see two options:
- Increase max_buckets to 60,000.
- Query the tree step by step in an msearch.
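To make the two options concrete, here is a rough sketch of the request payloads involved. It assumes the limit is changed through the dynamic `search.max_buckets` cluster setting (sent to `PUT _cluster/settings`) and that the partial queries go to `_msearch` as newline-delimited JSON. The index name `sales` and the helper names are hypothetical.

```python
import json


def max_buckets_settings(limit):
    """Option 1: body for PUT _cluster/settings that raises the bucket limit."""
    return {"persistent": {"search.max_buckets": limit}}


def msearch_payload(index, bodies):
    """Option 2: NDJSON body for _msearch, one header line plus one body line per search."""
    lines = []
    for body in bodies:
        lines.append(json.dumps({"index": index}))  # header line
        lines.append(json.dumps(body))              # search body line
    # The multi search API expects the payload to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(json.dumps(max_buckets_settings(60000)))
    # "sales" is a placeholder index name; the bodies here are empty stubs.
    print(msearch_payload("sales", [{"size": 0}, {"size": 0}]), end="")
```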
Here are the pros and cons of each option:
- 1 big query that returns 60,000 buckets:
  - Pro:
    - Simpler
  - Con:
    - Larger payload
- 6 small queries that each return 10,000 buckets?
  - Pro:
    - Smaller payload, because fewer buckets
  - Con:
    - The backend will have to rebuild the tree from each msearch result of 10,000 buckets, and I wonder whether Elasticsearch would do this better on its own.
    - More complex
- Do you have a pro or con to add?
- Which option do you recommend?
- Do you recommend other options?
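For context on the second option, this is roughly what the backend-side rebuild could look like. It is only a sketch under assumptions: each msearch response is assumed to have already been flattened into rows of `(path, year, amount)`, where `path` is `(region, country, city, name)` and every row is a leaf (salesperson-level) amount. The function names are hypothetical.

```python
from collections import defaultdict


def new_node():
    # One tree node: per-year totals plus child nodes keyed by name.
    return {"totals": defaultdict(int), "children": {}}


def add_row(root, path, year, amount):
    """Add one leaf amount and roll it up into every node along its path."""
    node = root
    for key in path:
        node = node["children"].setdefault(key, new_node())
        node["totals"][year] += amount


def rebuild_tree(responses):
    """Merge the rows of several partial results into a single tree."""
    root = new_node()
    for rows in responses:
        for path, year, amount in rows:
            add_row(root, path, year, amount)
    return root
```

If the partial queries return already-aggregated levels instead of leaf rows, the roll-up above would double count, so the merge logic would have to change accordingly.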