Match phrase queries to highlighted values


(Alexandra) #1

Hi,

My use case is the following: I have a text segment and a list of terms, and I want to find only the terms that match the text exactly.
For example, if my segment text is "This is a simple text." and my terms are "texts", "this" and "text", I want to find and highlight only the terms "this" and "text".
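
To make the expected behaviour concrete, here is a minimal plain-Java sketch of the matching I'm after, with no Elasticsearch involved. The class name ExactTermMatcher, the Match type and the term ids t1..t3 are made up for illustration, and the case-insensitive word-boundary regex is only a rough stand-in for what an analyzer would do:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExactTermMatcher {

    /** One occurrence of a term in the segment: the term id plus character offsets. */
    public static final class Match {
        public final String termId;
        public final int start; // inclusive
        public final int end;   // exclusive

        Match(String termId, int start, int end) {
            this.termId = termId;
            this.start = start;
            this.end = end;
        }
    }

    /** Returns every whole-word, case-insensitive occurrence of each term, tagged with its id. */
    public static List<Match> findMatches(String segment, Map<String, String> termsById) {
        List<Match> matches = new ArrayList<>();
        for (Map.Entry<String, String> entry : termsById.entrySet()) {
            // Word boundaries keep "texts" from matching the word "text" in the segment.
            Pattern pattern = Pattern.compile(
                    "\\b" + Pattern.quote(entry.getValue()) + "\\b",
                    Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
            Matcher matcher = pattern.matcher(segment);
            while (matcher.find()) {
                matches.add(new Match(entry.getKey(), matcher.start(), matcher.end()));
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, String> terms = new LinkedHashMap<>();
        terms.put("t1", "texts");
        terms.put("t2", "this");
        terms.put("t3", "text");
        for (Match m : findMatches("This is a simple text.", terms)) {
            System.out.println(m.termId + " [" + m.start + ", " + m.end + ")");
        }
        // prints:
        // t2 [0, 4)
        // t3 [17, 21)
    }
}
```

What I would like from Elasticsearch is the same kind of result: each highlighted span tied back to the id of the term that produced it.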

I'm building the query with the Java API like this (the segment is indexed):

BoolQueryBuilder query = QueryBuilders.boolQuery();
for (TermDocument termCandidate : termCandidates) {
    // One named phrase clause per candidate term; the query name carries the term id.
    query.should(QueryBuilders
            .matchPhraseQuery(ElasticsearchDocumentField.TEXT_CONTENT.getName(), termCandidate.getTermText())
            .slop(0).queryName(termCandidate.getId()).analyzer(EN_ANALYZER));
}

If I also highlight the terms (because in the end I need the offsets), they are all highlighted the same way and I can't tell which highlight belongs to which term (e.g. This is a simple text.)

So now, my questions:
1. Is there a way to highlight the terms from the query separately, and to associate an id with each of them so that I can match them back?
2. Is there a way to get the token numbers for an indexed text without using the analyze API? (This is unrelated to the first question.)


(system) #2