Hi. We are porting an application to Elasticsearch. We have both the plain text and an HTML rendering of each document available. We want to add the documents to ES, then search and show the highlights to the user in the HTML. What is the best way to achieve this? We want to retrieve the HTML for the entire document with the highlights applied. I have read about ES and experimented with it using the Sense plugin, and I am able to get basic highlighting to work. I am looking for recommendations and best practices from someone who has already accomplished this.
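For context, the kind of basic highlighting request I mean looks roughly like the sketch below, written as a Python dict for the search body. The field name `content` and the helper `build_highlight_query` are placeholders for illustration, not our real mapping. Setting `number_of_fragments` to 0 asks ES to return the whole field with highlights rather than short snippets, which is what we are after for the full document.

```python
import json


def build_highlight_query(text, field="content"):
    """Build a search body that matches `text` and highlights the whole field.

    `field` is a hypothetical field name; substitute the one in your mapping.
    """
    return {
        "query": {"match": {field: text}},
        "highlight": {
            # Tags wrapped around each highlighted term.
            "pre_tags": ["<em>"],
            "post_tags": ["</em>"],
            "fields": {
                # 0 fragments means: return the entire field content, highlighted.
                field: {"number_of_fragments": 0}
            },
        },
    }


body = build_highlight_query("quick brown fox")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into Sense as the body of a `_search` request against the index.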
Do we need to add both the plain text and the HTML to the index?
Don't we have to search the HTML in order to get highlights back in the HTML? If we search on it, then the HTML has to be indexed, right?
If we search the HTML, can the HTML tags be ignored?
What about highlighting a phrase that spans HTML tags?