Hey,
I am running an aggregation-only request that looks like this:
GET product_data/_search?request_cache=false&terminate_after=500000
{
  "timeout": "300ms",
  "track_total_hits": true,
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "simple_query_string": {
            "query": "buch",
            "default_operator": "AND",
            "fields": [
              "category_names"
            ]
          }
        }
        // few more filters here
      ]
    }
  },
  "aggs": {
    "category_id": {
      "terms": {
        "field": "category_id"
      }
    }
  }
}
I terminate after 500k documents (per shard) to avoid collecting millions of documents. With track_total_hits enabled, the request takes roughly 100ms. However, when I disable track_total_hits or leave it out completely, the runtime exceeds the 300ms timeout and the request takes about 380ms.
The responses also differ: it seems that with track_total_hits disabled, terminate_after is not honored. This is the response with track_total_hits set to true:
{
  "took": 96,
  "timed_out": false,
  "terminated_early": true,
  "_shards": {
    "total": 4,
    "successful": 4,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2000000,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "categoryId": {
      "doc_count_error_upper_bound": 16008,
      "sum_other_doc_count": 929766,
      "buckets": [
        {
          "key": "A",
          "doc_count": 393317
        },
        {
          "key": "B",
          "doc_count": 247372
        },
        {
          "key": "C",
          "doc_count": 221628
        },
        {
          "key": "D",
          "doc_count": 207917
        }
      ]
    }
  }
}
This is the response when track_total_hits is not set or is set to false:
{
  "took": 424,
  "timed_out": false,
  "terminated_early": false,
  "_shards": {
    "total": 4,
    "successful": 4,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 10000,
      "relation": "gte"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "categoryId": {
      "doc_count_error_upper_bound": 79596,
      "sum_other_doc_count": 4039869,
      "buckets": [
        {
          "key": "A",
          "doc_count": 2200027
        },
        {
          "key": "B",
          "doc_count": 1055606
        },
        {
          "key": "C",
          "doc_count": 934889
        },
        {
          "key": "D",
          "doc_count": 881217
        }
      ]
    }
  }
}
You can see that the counts clearly exceed the expected maximum of 2 million (4 shards × 500k from terminate_after). Is there some block-max WAND optimization that does not work properly?
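To make the comparison concrete, here is a minimal Python sketch that adds up the documents counted by the terms aggregation in both responses and compares the result with the cap implied by terminate_after. The helper name `collected_docs` is made up for illustration, and the sketch assumes that each document has a single category_id value and that the buckets shown are all that were returned; otherwise the sum would not equal the number of collected documents.

```python
def collected_docs(terms_agg: dict) -> int:
    """Sum the returned bucket counts plus the counts of all buckets not returned."""
    in_buckets = sum(bucket["doc_count"] for bucket in terms_agg["buckets"])
    return in_buckets + terms_agg["sum_other_doc_count"]


SHARDS = 4                 # "_shards.total" in both responses
TERMINATE_AFTER = 500_000  # per-shard limit from the request URL
expected_max = SHARDS * TERMINATE_AFTER

# Numbers copied from the two responses in this post.
with_tracking = {
    "sum_other_doc_count": 929766,
    "buckets": [
        {"key": "A", "doc_count": 393317},
        {"key": "B", "doc_count": 247372},
        {"key": "C", "doc_count": 221628},
        {"key": "D", "doc_count": 207917},
    ],
}
without_tracking = {
    "sum_other_doc_count": 4039869,
    "buckets": [
        {"key": "A", "doc_count": 2200027},
        {"key": "B", "doc_count": 1055606},
        {"key": "C", "doc_count": 934889},
        {"key": "D", "doc_count": 881217},
    ],
}

for label, agg in [("track_total_hits=true", with_tracking),
                   ("track_total_hits unset/false", without_tracking)]:
    total = collected_docs(agg)
    print(f"{label}: {total} docs aggregated, within cap: {total <= expected_max}")
```

With the numbers above, the first response adds up to exactly 2,000,000 documents, while the second adds up to 9,111,608, which is more than four times the expected cap.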
This is on Elasticsearch 8.14.1.
Is this a known bug, or is there anything I can do? Off the top of my head, setting track_total_hits: true on aggregating queries should not have any negative effect if everything works as expected, but some confirmation would be great.
Thanks for any hints about what is happening here.
Have a great week!
--Alex