Getting all words used in a document matching a stemmed query


(Sspi) #1

My scenario:

  • a partial full-text view is extracted from each document and sent to ES for indexing
  • each document is viewable by end users in several HTML formats (these formats are not known to ES)

My need: I want ES to return all the words in a document that match a stemmed query, so that an external script can highlight those terms in the viewable formats.

Example: when I ask for all words derived from "sky" that are present in the full-text field of the document with _id="1", I need ES to return ["skies", "ski", "skiing"].
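
To make the expected behaviour concrete, here is a minimal sketch in plain Python, outside of ES. It is only an illustration under loud assumptions: `toy_stem` is a hypothetical stand-in for whatever stemmer the index analyzer applies (it is not the ES stemmer and only handles a few suffixes), and `matching_surface_forms` is a made-up helper name, not an ES API.

```python
import re


def toy_stem(word):
    """Hypothetical toy stemmer, standing in for the real index analyzer."""
    word = word.lower()
    # Strip a few common suffixes, keeping at least three characters.
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            word = word[: -len(suffix)]
            break
    # Normalise a trailing "y" to "i" so that "sky" and "skies" meet.
    if word.endswith("y"):
        word = word[:-1] + "i"
    return word


def matching_surface_forms(text, query, stem=toy_stem):
    """Return the distinct words of `text` whose stem equals the stem of `query`."""
    target = stem(query)
    # Naive tokenisation (ASCII letters only), an assumption for this sketch.
    words = re.findall(r"[A-Za-z]+", text.lower())
    return sorted({w for w in words if stem(w) == target})


doc_text = "The skies were clear so we went skiing. My left ski broke."
print(matching_surface_forms(doc_text, "sky"))
```

With this toy stemmer, "sky", "skies", "ski" and "skiing" all reduce to the same stem, so the call returns the three forms that actually occur in the text. In a real setup the stemming would have to come from the same analyzer that the full-text field uses, otherwise the returned words may not agree with what the query matched.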

I have looked at the "highlight" (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-highlighting.html) and "termvectors" (https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-termvectors.html) services, but neither exactly matches my needs.

Have I missed something?
If I need to implement my own service, is there a preferred way to do it?


(system) #2