There seems to be a problem with the unified highlighter in Lucene 8.6.2 that affects Elasticsearch 7.9.1. I already asked the Lucene developers whether they knew of the bug, and they said it is a problem specific to the Elasticsearch implementation.
If a search is performed with more than 288 filter terms using the unified highlighter, it throws a too_complex_to_determinize_exception, but if you switch to the plain highlighter it works fine. Alternatively, if you filter on a "copy_to" field instead of the indexed field, it also works.
For the moment we have switched to the plain highlighter, but I was hoping there was a different solution.
This request throws the error:
{
  "highlight": {
    "type": "unified",
    "fields": {
      "title": {
        "require_field_match": false
      }
    }
  },
  "query": {
    "bool": {
      "must": [{
        "query_string": {
          "query": "*"
        }
      }],
      "filter": [{
        "bool": {
          "must": [{
            "terms": {
              "id": [ ">288 terms here" ]
            }
          }]
        }
      }]
    }
  }
}
But this request, which only changes the highlighter type to "plain", works fine:
{
  "highlight": {
    "type": "plain",
    "fields": {
      "title": {
        "require_field_match": false
      }
    }
  },
  "query": {
    "bool": {
      "must": [{
        "query_string": {
          "query": "*"
        }
      }],
      "filter": [{
        "bool": {
          "must": [{
            "terms": {
              "id": [ ">288 terms here" ]
            }
          }]
        }
      }]
    }
  }
}
Or, if I adjust the search to filter on the copy_to field, it works as well (note that "id" is now "_id"):
{
  "highlight": {
    "type": "unified",
    "fields": {
      "title": {
        "require_field_match": false
      }
    }
  },
  "query": {
    "bool": {
      "must": [{
        "query_string": {
          "query": "*"
        }
      }],
      "filter": [{
        "bool": {
          "must": [{
            "terms": {
              "_id": [ ">288 terms here" ]
            }
          }]
        }
      }]
    }
  }
}
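In case it helps anyone reproduce this, the three request bodies above differ only in the highlighter type and the filter field, so they could be generated with a small script instead of pasting hundreds of terms by hand. This is only a sketch: the helper name build_request and the "doc-N" id values are made up for illustration, and the count of 300 is just an arbitrary number above the 288 threshold I observed.

```python
import json


def build_request(ids, highlighter="unified", id_field="id"):
    """Build the search body from the post, varying only the
    highlighter type and the field used in the terms filter."""
    return {
        "highlight": {
            "type": highlighter,
            "fields": {"title": {"require_field_match": False}},
        },
        "query": {
            "bool": {
                "must": [{"query_string": {"query": "*"}}],
                "filter": [
                    {"bool": {"must": [{"terms": {id_field: ids}}]}}
                ],
            }
        },
    }


# Hypothetical id values; substitute real ids from your own index.
ids = [f"doc-{i}" for i in range(300)]

failing = build_request(ids)                        # unified highlighter, "id"
plain = build_request(ids, highlighter="plain")     # plain highlighter, "id"
by_meta_id = build_request(ids, id_field="_id")     # unified highlighter, "_id"

# Serialize one variant, e.g. to paste into a console or send with a client.
body = json.dumps(failing, indent=2)
print(len(body))
```

Each dictionary can be serialized with json.dumps and sent as the body of a _search request with whatever client you already use.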