Highlighting not appearing sometimes


(Eric Mill) #1

Is there any reason why a search result might not include highlighting for a
given query, when highlighting is requested, and there's a match?

I'm doing a "text" query of type "phrase", and for 4 out of the 5 hits in
the response, there's a highlight section returned, with proper
highlighting. On one of them, there's no highlighting at all, but when I
look at the field on the source document, I can see at least three instances
of the query (a single word) in the field.

-- Eric


(ofavre) #2

Make sure you have term_vector=with_positions_offsets in your mapping for
the relevant fields.
Otherwise, ES will reanalyze the field and try to match the term searched.
It can miss many occurrences if some stemming is performed.
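
As a rough illustration of the suggestion above, here is a minimal sketch that builds the two JSON bodies involved: a mapping that stores term vectors with positions and offsets, and a phrase-type "text" query with highlighting requested. The type name "doc", the field name "body", and both helper function names are hypothetical, chosen only for this example. An existing index would most likely need to be re-created and re-indexed for a term_vector change to take effect.

```python
import json

# Hypothetical names, used only for illustration.
DOC_TYPE = "doc"
FIELD = "body"


def build_mapping(doc_type=DOC_TYPE, field_name=FIELD):
    """Mapping body that stores term vectors with positions and offsets."""
    return {
        doc_type: {
            "properties": {
                field_name: {
                    "type": "string",
                    "term_vector": "with_positions_offsets",
                }
            }
        }
    }


def build_search(word, field_name=FIELD):
    """Search body: phrase-type "text" query plus highlighting on the same field."""
    return {
        "query": {"text": {field_name: {"query": word, "type": "phrase"}}},
        "highlight": {"fields": {field_name: {}}},
    }


if __name__ == "__main__":
    # Print both bodies so they can be pasted into an HTTP client.
    print(json.dumps(build_mapping(), indent=2))
    print(json.dumps(build_search("example"), indent=2))
```

The printed bodies are meant to be sent to the put-mapping and search endpoints respectively; the script itself does not talk to a cluster.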

--
Olivier Favre

2011/6/14 Eric Mill kprojection@gmail.com

Is there any reason why a search result might not include highlighting for
a given query, when highlighting is requested, and there's a match?

I'm doing a "text" query of type "phrase", and for 4 out of the 5 hits in
the response, there's a highlight section returned, with proper
highlighting. On one of them, there's no highlighting at all, but when I
look at the field on the source document, I can see at least three instances
of the query (a single word) in the field.

-- Eric


(ofavre) #3

Hey, I'm hitting the same bug. I thought it was fixed by what I suggested in
my previous mail, but I was wrong somehow...

I propose the following patch for it. See attached file.
If it seems acceptable, I'll be glad to do a pull-request for:
https://github.com/ofavre/elasticsearch/commit/70509e431d03784b8d1957151630b0a784e021f5
(master)

https://github.com/ofavre/elasticsearch/commit/3ba37163d42a53f30b6a602ca4a228d86a93e3ab
(0.17)

By the way, I have a question for the Lucene community. (In the absence of an
answer from kimchy, I'll post to Lucene's dev mailing list.)
Why does FieldQuery rewrite queries so extensively? (With potential recursion
problems.)
Why don't they use extractTerms() after one rewrite (which can eliminate
terms from "must_not" clauses of BooleanQuery instances, etc.)?
After all, in the fast highlighter, it's the matched terms that we are
highlighting...
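
To make the question concrete, here is a toy sketch of the idea: walk a query tree once and collect only the terms that can contribute to a match, skipping everything under a "must_not" clause. The TermQuery and BooleanQuery classes below are simplified stand-ins written for this example. They are not Lucene's classes, and extract_highlight_terms is a hypothetical helper, not Lucene's extractTerms().

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple, Union


@dataclass(frozen=True)
class TermQuery:
    """Toy stand-in for a single-term query."""
    field_name: str
    text: str


@dataclass
class BooleanQuery:
    """Toy stand-in: clauses are (occur, query) pairs,
    with occur one of "must", "should" or "must_not"."""
    clauses: List[Tuple[str, "Query"]] = field(default_factory=list)


Query = Union[TermQuery, BooleanQuery]


def extract_highlight_terms(query: Query) -> Set[Tuple[str, str]]:
    """Collect (field_name, text) pairs, ignoring must_not subtrees."""
    if isinstance(query, TermQuery):
        return {(query.field_name, query.text)}
    terms: Set[Tuple[str, str]] = set()
    for occur, sub_query in query.clauses:
        if occur == "must_not":
            # These terms are not what made the document match.
            continue
        terms |= extract_highlight_terms(sub_query)
    return terms


if __name__ == "__main__":
    q = BooleanQuery([
        ("must", TermQuery("body", "highlight")),
        ("should", BooleanQuery([("must", TermQuery("body", "lucene"))])),
        ("must_not", TermQuery("body", "spam")),
    ])
    print(sorted(extract_highlight_terms(q)))
```

In this toy model the "spam" term never reaches the highlighter, which is the behaviour the question is asking about.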

Best regards,

Olivier Favre

2011/7/13 Olivier Favre olivier@yakaz.com

Make sure you have term_vector=with_positions_offsets in your mapping for
the relevant fields.
Otherwise, ES will reanalyze the field and try to match the term searched.
It can miss many occurrences if some stemming is performed.

--
Olivier Favre

www.yakaz.com


(Shay Banon) #4

I need to dive into the highlighter implementation again to check. Maybe you
can ask on the Lucene mailing list as well...

On Thu, Jul 21, 2011 at 7:58 PM, Olivier Favre olivier@yakaz.com wrote:

Hey, I'm hitting the same bug. I thought it was fixed by what I suggested
in my previous mail, but I was wrong somehow...

I propose the following patch for it. See attached file.
If it seems acceptable, I'll be glad to do a pull-request for:

https://github.com/ofavre/elasticsearch/commit/70509e431d03784b8d1957151630b0a784e021f5
(master)

https://github.com/ofavre/elasticsearch/commit/3ba37163d42a53f30b6a602ca4a228d86a93e3ab
(0.17)

By the way, I have a question for the Lucene community. (In the absence of an
answer from kimchy, I'll post to Lucene's dev mailing list.)
Why does FieldQuery rewrite queries so extensively? (With potential recursion
problems.)
Why don't they use extractTerms() after one rewrite (which can eliminate
terms from "must_not" clauses of BooleanQuery instances, etc.)?
After all, in the fast highlighter, it's the matched terms that we are
highlighting...

Best regards,

Olivier Favre

www.yakaz.com


(ofavre) #5

Here is the JIRA issue I opened:
https://issues.apache.org/jira/browse/LUCENE-3332?focusedCommentId=13069565

--
Olivier Favre

2011/7/22 Shay Banon shay.banon@elasticsearch.com

I need to dive into the highlighter implementation again to check. Maybe you
can ask on the Lucene mailing list as well...

