Elasticsearch - EdgeNgram + highlight + term_vector = bad highlights

Sebastien_Lorber · July 6, 2012, 1:40pm

Hello,

I originally posted this question on StackOverflow, but nobody answered,
so I'm asking here.

When I use an analyzer with edgengram (min=3, max=7, front) together with
term_vector=with_positions_offsets, on a document with text = "CouchDB",
and I search for "couc", my highlight is on "cou" and not on "couc".
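
For reference, a minimal sketch of the kind of index setup described above, written as Python dicts that serialize to the JSON request bodies. The names "edge_front", "edge_analyzer", "doc" and "text" are made up for illustration, and the standard tokenizer plus lowercase filter are assumptions; only the edge n-gram parameters and the term_vector option come from the post. The "edgeNGram" filter type, the "side" parameter and the "string" field type are the names used by Elasticsearch versions of that era; newer versions name them differently.

```python
import json

# Analysis settings: an edge n-gram token filter (min=3, max=7, front).
# "edge_front" and "edge_analyzer" are hypothetical names.
settings = {
    "analysis": {
        "filter": {
            "edge_front": {
                "type": "edgeNGram",
                "min_gram": 3,
                "max_gram": 7,
                "side": "front",
            }
        },
        "analyzer": {
            "edge_analyzer": {
                "type": "custom",
                "tokenizer": "standard",  # assumption, not stated in the post
                "filter": ["lowercase", "edge_front"],  # lowercase is an assumption
            }
        },
    }
}

# Mapping: the analyzed field also stores term vectors with positions and offsets.
# "doc" and "text" are hypothetical type and field names.
mappings = {
    "doc": {
        "properties": {
            "text": {
                "type": "string",
                "analyzer": "edge_analyzer",
                "term_vector": "with_positions_offsets",
            }
        }
    }
}

index_body = json.dumps({"settings": settings, "mappings": mappings}, indent=2)
print(index_body)
```

Dropping the "term_vector" key from the mapping gives the second variant, the one where the highlight lands on "couc".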

It seems my highlight is only on the minimum matching token "cou", while I
would expect it to be on the whole word, or at least on the longest token found.
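
To make the candidate tokens concrete, here is a small Python simulation of front edge n-grams with min=3 and max=7. It is only a sketch: edge_ngrams is a hypothetical helper, not an Elasticsearch API, and it assumes the text is lowercased and that the query string goes through the same analyzer as the document.

```python
def edge_ngrams(token, min_gram=3, max_gram=7):
    """Return the front edge n-grams of a single token (lowercased)."""
    token = token.lower()  # assumption: a lowercase filter runs before the n-gram step
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


indexed_tokens = edge_ngrams("CouchDB")
query_tokens = edge_ngrams("couc")

# Query tokens that also exist in the indexed document.
shared_tokens = [t for t in query_tokens if t in indexed_tokens]

print(indexed_tokens)  # ['cou', 'couc', 'couch', 'couchd', 'couchdb']
print(query_tokens)    # ['cou', 'couc']
print(shared_tokens)   # ['cou', 'couc']
```

Under these assumptions both "cou" and "couc" match the document, so either could be highlighted. The simulation does not show which one the highlighter picks, only that the longer token is available.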

It works fine when the text is not indexed with
term_vector=with_positions_offsets: the highlight is then on "couc" and not
on "cou".

What is the performance impact of removing
term_vector=with_positions_offsets?

Thanks
