For the introduction, we are running into a performance issues when multi-level terms aggregation is being applied.
We’re trying to implement multi-level Terms aggregation, up-to 5 levels and running into performance issues, that is related both with the cluster performance and timing issues.
Prior to this, we were aggregating up-to 2 levels, and increasing it to 5 is seeming not to be performance and timing friendly.
The query looks like this :
{
"size": 0,
"query": {
"bool": {
"filter": [
{
"bool": {
"filter": [
{
"bool": {
"adjust_pure_negative": true,
"must": [
{
"bool": {
"adjust_pure_negative": true,
"must": [
{
"bool": {
"adjust_pure_negative": true,
"boost": 1
}
},
{
"bool": {
"adjust_pure_negative": true,
"must": [
{
"bool": {
"adjust_pure_negative": true,
"must": [
{
"bool": {
"adjust_pure_negative": true,
"must": [
{
"range": {
"dateFilters": {
"include_lower": true,
"include_upper": true,
"from": "2020-04-01",
"boost": 1,
"to": "2021-03-31"
}
}
}
],
"boost": 1
}
}
],
"boost": 1
}
}
],
"boost": 1
}
}
],
"boost": 1
}
},
{
"bool": {
"adjust_pure_negative": true,
"must_not": [
{
"term": {
"key1": {
"boost": 1,
"value": ""
}
}
}
],
"boost": 1
}
},
{
"bool": {
"adjust_pure_negative": true,
"must": [
{
"exists": {
"field": "key1",
"boost": 1
}
}
],
"boost": 1
}
},
{
"bool": {
"adjust_pure_negative": true,
"must": [
{
"exists": {
"field": "key2",
"boost": 1
}
}
],
"boost": 1
}
},
{
"bool": {
"adjust_pure_negative": true,
"must": [
{
"exists": {
"field": "key3",
"boost": 1
}
}
],
"boost": 1
}
},
{
"bool": {
"adjust_pure_negative": true,
"must_not": [
{
"term": {
"key4": {
"boost": 1,
"value": ""
}
}
}
],
"boost": 1
}
},
{
"bool": {
"adjust_pure_negative": true,
"must": [
{
"exists": {
"field": "key4",
"boost": 1
}
}
],
"boost": 1
}
}
],
"boost": 1
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
{
"bool": {
"adjust_pure_negative": true,
"boost": 1
}
}
],
"adjust_pure_negative": true,
"must": [
{
"terms": {
"boost": 1,
"CustomType": [
"CustomType"
]
}
}
],
"boost": 1
}
},
"aggregations": {
"Key1": {
"terms": {
"shard_min_doc_count": 0,
"field": "Key1",
"size": 2147483647,
"show_term_doc_count_error": false,
"min_doc_count": 1,
"order": {
"_key": "asc"
}
},
"aggregations": {
"Key2": {
"terms": {
"shard_min_doc_count": 0,
"field": "Key2",
"size": 2147483647,
"show_term_doc_count_error": false,
"min_doc_count": 1,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"Key3": {
"terms": {
"shard_min_doc_count": 0,
"field": "Key3",
"size": 2147483647,
"show_term_doc_count_error": false,
"min_doc_count": 1,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"Key4": {
"terms": {
"shard_min_doc_count": 0,
"field": "Key4",
"size": 2147483647,
"show_term_doc_count_error": false,
"min_doc_count": 1,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"filterAggs": {
"filters": {
"other_bucket": false,
"other_bucket_key": "other",
"filters": [
{
"match_all": {
"boost": 1
}
}
]
},
"aggregations": {
"expenses": {
"extended_stats": {
"sigma": 2,
"field": "expenses"
}
}
}
}
}
}
}
}
}
}
}
},
"totalAggs": {
"filters": {
"other_bucket": false,
"other_bucket_key": "other",
"filters": [
{
"match_all": {
"boost": 1
}
}
]
},
"aggregations": {
"expenses": {
"extended_stats": {
"sigma": 2,
"field": "expenses"
}
}
}
}
},
"track_total_hits": 2147483647
}
What we’re trying to achieve here is that based on Key1-Key2-Key3-Key4 get a certain numerical value related to these keys, along with the total value irrespective of the keys.
For eg: A customer’s expense in a store. Name-State-Phone Number-Gender -> Expense, and the total expense of all the customers.
2 keys were supported at first, and were always applied with size : Max.Integer_Value.
Now, there is a requirement to increase the keys up-to 5. When applied in the same fashion, we’re getting poor cluster performance(this was known to cause poor performance, but we had to roll this feature out anyway, so still persisted in it, hoping for some ways we could improve the performance) and most importantly time-out issues(taking time in minutes, depending on the cardinality of fields selected.
The query is working for key combinations in which all the keys have low cardinalities. However, adding only one high cardinality field would make the performance terrible.
The order in which the keys will be received is not fixed. Suppose in some cases, key1 might have the higher cardinality values, sometimes not, sometimes the last key might.
If some filters are applied then, the query wouldn’t cause time-out issues, however the user has the option of running the query without applying any filters also.
The case we’re currently facing is that for almost 3.9M document counts and above, the query is taking 2-3 minutes to run, however our application time-out is 60 seconds.
Some questions I would like to mention,
A) Is using the size of Integer MAX value causing the performance issues? As I went through some stack overflow answers, most of them certainly did not recommend using MAX Integer value. Will calculating the cardinality of each field and adding some greater constants, say adding 5000 in each cardinality value and applying it in the size of Terms aggregation work? Is it feasible or doesn’t vary much?
Note: I also tried by reducing the size of Terms Aggregation to 5000 in all the levels. Yet, the query was taking around 1.8m for 7M documents.
B) I also went through the BreadthFirst and DepthFirst search, and came to know that there might be combinatorial explosions in certain cases, thus causing poor search performance. When exactly should the BreadthFirst search be used?
It would be helpful to get the explanation from the example above.
For eg: key1 is low cardinality field and has 3 or 4 unique counts, and key2 is high cardinality field having upto 2M distinct count. And key3 and key4 have lower cardinality counts than that of key2.
In which level should the BreadthFirst be implemented so that performance could be improved.
C) I have also tried implementing eager_global_ordinals: TRUE for high cardinality fields and set the cluster refresh interval to 100 seconds. But this isn’t helping either. Shows no timing improvements when compared to the old index without eager_global_ordinals.
Could anyone please help with any sort of information?
Thanks in advance. And sorry for the long write up.