Unified Highlighter throws too_complex_to_determinize_exception with >288 filter terms

There seems to be a problem with the unified highlighter in Lucene 8.6.2 that affects Elasticsearch 7.9.1. I asked the Lucene developers whether they knew of the bug, and they said it is a problem specific to the Elasticsearch implementation.

A search with more than 288 filter terms throws a too_complex_to_determinize_exception when the unified highlighter is used, but the same search works fine with the plain highlighter. Filtering on a "copy_to" field instead of the indexed field also works.

For the moment we have switched to the plain highlighter, but I am hoping there is a different solution.

This throws the error:

{
    "highlight": {
        "type": "unified",
        "fields": {
            "title": {
                "require_field_match": false
            }
        }
    },
    "query": {
        "bool": {
            "must": [{
                "query_string": {
                    "query": "*"
                }
            }],
            "filter": [{
                "bool": {
                    "must": [{
                        "terms": {
                            "id": [ ">288 terms here" ]
                        }
                    }]
                }
            }]
        }
    }
}

But this works fine:

{
    "highlight": {
        "type": "plain",
        "fields": {
            "title": {
                "require_field_match": false
            }
        }
    },
    "query": {
        "bool": {
            "must": [{
                "query_string": {
                    "query": "*"
                }
            }],
            "filter": [{
                "bool": {
                    "must": [{
                        "terms": {
                            "id": [ ">288 terms here" ]
                        }
                    }]
                }
            }]
        }
    }
}

Or, if I adjust the search to use the copy_to field, it works as well (note that "id" is now "_id"):

{
    "highlight": {
        "type": "unified",
        "fields": {
            "title": {
                "require_field_match": false
            }
        }
    },
    "query": {
        "bool": {
            "must": [{
                "query_string": {
                    "query": "*"
                }
            }],
            "filter": [{
                "bool": {
                    "must": [{
                        "terms": {
                            "_id": [ ">288 terms here" ]
                        }
                    }]
                }
            }]
        }
    }
}
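
For anyone who wants to reproduce this, the three requests above differ only in the highlighter type and the filter field, so they can be generated from one helper. This is only a rough sketch: the function name build_request and the placeholder ID values are made up for illustration, and I am assuming the IDs are strings. It only builds and prints the request bodies; sending them to a cluster is left out.

```python
import json


def build_request(highlighter_type, filter_field, ids):
    """Build the search body from this post, varying only the
    highlighter type and the field used in the terms filter."""
    return {
        "highlight": {
            "type": highlighter_type,
            "fields": {"title": {"require_field_match": False}},
        },
        "query": {
            "bool": {
                "must": [{"query_string": {"query": "*"}}],
                "filter": [
                    {"bool": {"must": [{"terms": {filter_field: ids}}]}}
                ],
            }
        },
    }


# Hypothetical placeholder IDs; 289 is just over the 288 terms that still work.
ids = [str(i) for i in range(289)]

failing = build_request("unified", "id", ids)      # reported to throw
plain = build_request("plain", "id", ids)          # reported to work
other_field = build_request("unified", "_id", ids) # reported to work

# Print the number of filter terms in each generated body.
for name, body in [("unified/id", failing), ("plain/id", plain), ("unified/_id", other_field)]:
    terms = body["query"]["bool"]["filter"][0]["bool"]["must"][0]["terms"]
    print(name, len(next(iter(terms.values()))))

# Serialize one body so it can be pasted into a search request.
payload = json.dumps(failing)
```

Changing the length of the ids list makes it easy to check where the threshold sits for a given index.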
