Weird. It's just how the no_match segmenter works in the plain highlighter. It just grabs text ending at the last token before the end of the text. I wrote this many years ago to simulate how the plain highlighter does segmentation when it finds hits, but it looks like it's wrong. This is a bug, but I don't think it'll be too high on my priority list, sadly:
Yeah, I'm aware. To varying degrees they are able to delegate down to the Lucene bits.
They all implement the process differently. You should try it with the fvh; it's more likely to work. The postings highlighter isn't going to do what you want unless you feed it complete sentences.
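If it helps, here is a rough sketch of what a request using the fvh with no_match_size might look like, written as a Python dict you could send as the search body. The field names ("title", "body"), the query text and the helper name are made up for illustration. It also assumes the highlighted field is mapped with term_vector set to with_positions_offsets, which the fvh needs.

```python
import json


def build_fvh_request(query_field, highlight_field, query_text, no_match_size=150):
    """Build a search body that highlights `highlight_field` with the fvh.

    Hypothetical helper: the field names and defaults are assumptions,
    not anything taken from the original thread.
    """
    return {
        # Query one field so that the highlighted field may have no hits,
        # which is the case where no_match_size kicks in.
        "query": {"match": {query_field: query_text}},
        "highlight": {
            "fields": {
                highlight_field: {
                    "type": "fvh",  # fast vector highlighter
                    "no_match_size": no_match_size,  # chars to return when nothing matches
                }
            }
        },
    }


if __name__ == "__main__":
    body = build_fvh_request("title", "body", "lucene", no_match_size=100)
    print(json.dumps(body, indent=2))
```

Switching "type" to "plain" in the same body is an easy way to compare how the two highlighters cut the no-match text.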
Lucene doesn't have support for no_match_size. Most of the code Elasticsearch has for highlighting is really just there to adapt the API to Lucene's highlighters. no_match_size is kind of an anomaly in that it's trying to implement something without upstreaming it. And I'm not 100% sure why I didn't upstream the change at the time.