Elasticsearch *Highlighting* feature breaks highlights by words, not phrases

ElasticSearch v7.5

Hello and good day!

I'm using Elasticsearch's Highlighting feature. ES is able to emphasize the keywords I use in my query_string query. Sample query:

GET socialmedia/_search
{
  "size": 5,
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "fields": ["title" , "content"],
            "query": "World War III"
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "content" : {
        "type" : "unified"
      }
    }
  }
}

RESPONSE (fields irrelevant to this question omitted)

"highlight" : {
      "content" : [
        "<em>World</em> <em>War</em> <em>III</em>"
      ]
}

As you can see, the output wraps each word in its own HTML tag instead of wrapping the whole phrase. My desired output should look like this:

<em> World War III </em>

Am I missing anything in this Highlighting feature?

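Technically, your search is for World OR War OR III, not a phrase query. query_string uses OR as its default operator, so an unquoted query matches documents that contain any of the terms. To search for the phrase itself, wrap it in escaped double quotes inside the query string.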
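To illustrate the difference, here is a minimal sketch in Python that builds the same request body as in the question, but with the phrase wrapped in double quotes so query_string treats it as a phrase. The helper name `phrase_search_body` is hypothetical, and the field names and size are simply copied from the question.

```python
import json


def phrase_search_body(phrase, fields=("title", "content"), size=5):
    """Build a search body whose query_string query is a quoted phrase.

    Hypothetical helper for illustration; fields and size mirror the question.
    """
    # Escape any embedded double quotes, then wrap the whole phrase in quotes.
    quoted = '"' + phrase.replace('"', '\\"') + '"'
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [
                    {"query_string": {"fields": list(fields), "query": quoted}}
                ]
            }
        },
        "highlight": {"fields": {"content": {"type": "unified"}}},
    }


body = phrase_search_body("World War III")
# json.dumps escapes the inner quotes, giving "query": "\"World War III\""
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of the GET socialmedia/_search request. This changes which documents match; it does not by itself change how the highlighter tags the terms.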

That said, even with a phrase query the highlighter still tags each matching term separately. There's an open issue tracking phrase highlighting.
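Until that is resolved, one possible workaround is to merge adjacent tags on the client side. The following is only a sketch, not something Elasticsearch provides: `merge_adjacent_highlights` is a hypothetical helper, and it assumes the default `<em>` and `</em>` highlight tags.

```python
import re

# A closing tag, optional whitespace, then an opening tag.
# Assumes the default <em>/</em> highlight tags are in use.
_ADJACENT_TAGS = re.compile(r"</em>(\s*)<em>")


def merge_adjacent_highlights(fragment):
    """Collapse runs of back-to-back highlighted terms into one <em> span."""
    # Drop the inner tag pair but keep the whitespace that separated the terms.
    return _ADJACENT_TAGS.sub(r"\1", fragment)


merged = merge_adjacent_highlights("<em>World</em> <em>War</em> <em>III</em>")
print(merged)  # <em>World War III</em>
```

Note that this merges any highlighted terms that happen to sit next to each other, whether or not they matched as a phrase, so it is best paired with a quoted phrase query.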
