Hi,
here https://gist.github.com/1222046 is a reproduction of highlighting
issues with the fast-vector-highlighter in proximity searches and in
exact-phrase searches where the phrase contains stop words.
The reproduction shows two cases where the default/plain highlighter
works correctly, or at least better than the fast-vector-highlighter
(there may be other queries where the situation is reversed). So my
question is: if these fast-vector-highlighter bugs will take some time
to fix, would it be useful, as a quick fix, to expose the choice of
highlighter? Then, for a field mapped with "term_vector" :
"with_positions_offsets", users could choose between the
fast-vector-highlighter and the default/plain highlighter. Even though
the fast-vector-highlighter is much faster and should normally be used
for fields with term vectors stored, in some cases (such as proximity
and stop-word phrases) it does not work correctly, so being able to
fall back to the plain highlighter would help.
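To make the request concrete, here is a rough sketch of what such a
choice could look like from the client side. It only builds the JSON
bodies and does not talk to a cluster. The field name "body", the
helper name highlight_request and the "string" field type are
assumptions made up for illustration. The per-field "type" : "plain"
highlight option is borrowed from later Elasticsearch releases; as far
as I know it was not available when this thread was written, so treat
it as the kind of switch being asked for, not as something confirmed
here.

```python
import json

# Hypothetical mapping fragment: "body" is an illustrative field name.
# Only the term_vector setting comes from the discussion above.
MAPPING_FRAGMENT = {
    "body": {
        "type": "string",
        "term_vector": "with_positions_offsets",
    }
}


def highlight_request(field, lucene_query, force_plain=False):
    """Build a search body that highlights `field`.

    When force_plain is True, the field's highlight options ask for the
    plain highlighter explicitly (assumed option: "type": "plain").
    Otherwise the options stay empty and the server picks its default.
    """
    field_options = {}
    if force_plain:
        field_options["type"] = "plain"
    return {
        "query": {
            "query_string": {
                "default_field": field,
                "query": lucene_query,
            }
        },
        "highlight": {"fields": {field: field_options}},
    }


if __name__ == "__main__":
    # Proximity search in Lucene query syntax: phrase with slop 3.
    body = highlight_request("body", '"quick fox"~3', force_plain=True)
    print(json.dumps(body, indent=2))
```

With force_plain left at False the request is unchanged from what one
would send today, so the switch is opt-in per field and per request.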
I need to chase this down in the actual implementation of the fast
vector highlighter. It's not too difficult, just somewhat time
consuming. Can you open an issue so we can chase it down there?
On Fri, Sep 16, 2011 at 3:22 PM, Tomislav Poljak tpoljak@gmail.com wrote:
[...]
On Tuesday, May 29, 2012 3:43:47 PM UTC-4, kimchy wrote:
[...]