Highlight_query does treat all clauses and terms as boolean OR

See https://github.com/elastic/elasticsearch/issues/20676

I have the same issue. The documentation says:
"It is also possible to highlight against a query other than the search query by setting highlight_query. This is especially useful if you use a rescore query because those are not taken into account by highlighting by default. Elasticsearch does not validate that highlight_query contains the search query in any way so it is possible to define it so legitimate query results aren’t highlighted at all. Generally it is better to include the search query in the highlight_query. Here is an example of including both the search query and the rescore query in highlight_query."

https://www.elastic.co/guide/en/elasticsearch/reference/5.x/search-request-highlighting.html

It also gives the following example:

    "highlight" : {
        "order" : "score",
        "fields" : {
            "content" : {
                "fragment_size" : 150,
                "number_of_fragments" : 3,
                "highlight_query": {
                    "bool": {
                        "must": {
                            "match": {
                                "content": {
                                    "query": "foo bar"
                                }
                            }
                        },
                        "should": {
                            "match_phrase": {
                                "content": {
                                    "query": "foo bar",
                                    "slop": 1,
                                    "boost": 10.0
                                }
                            }
                        },
                        "minimum_should_match": 0
                    }
                }
            }
        }
    }

The current behaviour treats all terms as OR and highlights anything it finds, regardless of whether it was defined through bool must, should, query_string AND, or |(value1 + value2).

What is the purpose of having a separate query in the first place, then, if all the highlighter does is extract every term into a flat list of terms to match? Is there a way to use boolean logic in the highlighter?

Hey,

Your match query runs as an OR query (OR is the default operator for match), so this is expected behaviour. You can of course change this in the match query. As mentioned in the documentation, a highlight query makes sense when you use a fast query to get matching documents and a slower query to refine the results via the rescore query. In that case, using the slower query in the highlight_query makes a lot of sense.
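As a minimal sketch of the operator change mentioned above, the request body below sets "operator": "and" on the match query, both in the search query and in the highlight_query. The field name content and the text foo bar are taken from the example earlier in the thread; the rest of the request shape is an assumption for illustration. It only shows where the setting goes. Whether the highlighter then restricts highlighting to text containing both terms is not confirmed here and may depend on the highlighter type and the Elasticsearch version.

```json
{
  "query": {
    "match": {
      "content": {
        "query": "foo bar",
        "operator": "and"
      }
    }
  },
  "highlight": {
    "fields": {
      "content": {
        "highlight_query": {
          "match": {
            "content": {
              "query": "foo bar",
              "operator": "and"
            }
          }
        }
      }
    }
  }
}
```

With "operator": "and", the search itself only returns documents whose content field contains both foo and bar, so the comparison to make is between the highlighted fragments of this request and those of the same request without the operator setting.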

Hope that helps!

--Alex
