Highlighting uses DefaultEncoder, leading to incorrect HTML escaping


(Serge Nekoval) #1

I noticed that HighlightPhase.java hardcodes the use of
DefaultEncoder, which does not escape the text before adding highlight
tags. I know that Lucene's highlighter supports HTML encoding, so is
there a chance ElasticSearch could support it too?

The problem is critical when highlighting plain-text fields. If you
escape a field after highlighting (on the client), the highlight tags
are escaped as well. Note that all user-provided content HAS to be
escaped before you display it inside an HTML page.
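
To make the ordering problem concrete, here is a minimal, self-contained sketch. It does not use Lucene or ElasticSearch classes: `highlight` and `escapeHtml` are hypothetical stand-ins, and the `encoder` argument only imitates the role an encoder plays in a highlighter (it is applied to the text fragments, never to the tags).

```java
import java.util.function.UnaryOperator;

public class HighlightEscapeDemo {

    // Hypothetical helper: escapes the characters that are significant in HTML.
    static String escapeHtml(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Toy highlighter (an assumption, not the real implementation): wraps each
    // occurrence of term in <em> tags and passes every text fragment through
    // the encoder. The tags themselves are appended without encoding.
    static String highlight(String text, String term, UnaryOperator<String> encoder) {
        if (term.isEmpty()) {
            return encoder.apply(text);
        }
        StringBuilder out = new StringBuilder();
        int from = 0;
        int hit;
        while ((hit = text.indexOf(term, from)) >= 0) {
            out.append(encoder.apply(text.substring(from, hit)));
            out.append("<em>").append(encoder.apply(term)).append("</em>");
            from = hit + term.length();
        }
        out.append(encoder.apply(text.substring(from)));
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "Use <script> tags & search";

        // Pass-through encoder: user content reaches the page unescaped.
        String raw = highlight(text, "search", s -> s);
        System.out.println(raw);

        // Escaping on the client afterwards also escapes the <em> tags.
        System.out.println(escapeHtml(raw));

        // Escaping inside the highlighter keeps the tags and escapes the content.
        System.out.println(highlight(text, "search", HighlightEscapeDemo::escapeHtml));
    }
}
```

Only the third variant gives output that is both safe and still highlighted, which is why the escaping has to happen inside the highlighting step rather than after it.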

Am I the only one experiencing this problem?


(Cory) #2

This is a problem for me also. Did you find a workaround?

-Cory
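
One possible client-side workaround, sketched under assumptions rather than taken from this thread: if the highlight request lets you choose the tags that wrap each match (ElasticSearch's highlight options include `pre_tags` and `post_tags`), you can request plain marker strings instead of HTML tags, escape the whole fragment on the client, and only then turn the markers into real tags. The marker values and the `render` helper below are hypothetical.

```java
public class PlaceholderHighlightWorkaround {

    // Hypothetical markers requested instead of <em> and </em>. They contain
    // no characters that HTML escaping would change.
    static final String PRE_MARKER = "@@HL_START@@";
    static final String POST_MARKER = "@@HL_END@@";

    // Hypothetical helper: escapes the characters that are significant in HTML.
    static String escapeHtml(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Escape the fragment first, then swap the markers for real tags.
    static String render(String highlightedFragment) {
        return escapeHtml(highlightedFragment)
                .replace(PRE_MARKER, "<em>")
                .replace(POST_MARKER, "</em>");
    }

    public static void main(String[] args) {
        String fragment = "a <b> " + PRE_MARKER + "hit" + POST_MARKER + " & c";
        System.out.println(render(fragment));
    }
}
```

The caveat is that a document containing the marker text itself would be rendered with a stray `<em>` tag, so the markers should be strings that are very unlikely to occur in the indexed content.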


