Mapping the query to the highlight using FVH

Highlighters are great because they help you figure out why your query matched a document. Unfortunately, when they are used in conjunction with any kind of analyzer that stems the words in the query, it becomes difficult to tell which part of the query produced a given highlight. The FVH (fast vector highlighter) and its pre/post tags help a bit, but the docs do not make clear what its expected behavior is.

I wrote a post a while ago about the topic but got no discussion going, so I've written a longer deep dive on the subject here:

https://jack-hodkinson.medium.com/reverse-engineering-elasticsearch-highlights-e36ec4164e84

Thanks for the write-up.
One approach for more transparency into which search terms matched might be to use the highlighter that ships with the annotated_text field, which is installed as an extra plugin. Although it's designed for use on annotated_text fields, it also works with text fields. It uses a markdown-like syntax for adding annotations to text, e.g. given this doc:

{
	"text": "The new hot technology has emerged!"
}

and this query:

{
	"query": {
		"bool": {
			"should": [
				{"match": {"text": "fabulous"}},
				{"match": {"text": "new"}}
			],
			"must": [
				{"match": {"text": "technology"}},
				{"wildcard": {"text": "emerg*"}}
			]
		}
	},
	"highlight": {
		"type": "annotated",
		"fields": {
			"text": {}
		}
	}
}

The marked-up result shows which query term matched each highlighted section of text:

        "highlight": {
          "text": [
            "The [new](_hit_term=new) hot [technology](_hit_term=technology) has [emerged](_hit_term=text%3Aemerg*)!"
          ]
        }
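
If you want to consume that markup programmatically, something like the following might work. It is only a sketch in Python: the function name `extract_hit_terms` is made up for illustration, and it assumes the annotations follow the `[text](_hit_term=...)` shape shown above, with URL-encoded values and multiple annotations on one span separated by `&`.

```python
import re
from urllib.parse import unquote

# Matches [visible text](annotation) as seen in the highlight fragment above.
ANNOTATION = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def extract_hit_terms(fragment):
    """Return (matched_text, [query terms]) pairs from one highlight fragment.

    Hypothetical helper, not part of Elasticsearch or its clients.
    """
    hits = []
    for text, annotation in ANNOTATION.findall(fragment):
        # Assumption: several annotations on one span are '&'-separated.
        terms = [
            unquote(part.split("=", 1)[1])  # e.g. text%3Aemerg* -> text:emerg*
            for part in annotation.split("&")
            if part.startswith("_hit_term=")
        ]
        if terms:
            hits.append((text, terms))
    return hits


fragment = (
    "The [new](_hit_term=new) hot [technology](_hit_term=technology) "
    "has [emerged](_hit_term=text%3Aemerg*)!"
)
for text, terms in extract_hit_terms(fragment):
    print(text, "<-", terms)
```

For the fragment above this pairs "new" with `new`, "technology" with `technology`, and "emerged" with the wildcard `text:emerg*`, which is the query-to-highlight mapping the original post was after.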

That could be a nice solution. I've been meaning to try out annotated text too. I'll give it a go. Thanks!
