I am trying to write a plugin that scores documents based on their similarity to the search query (to use Elasticsearch as a translation memory database engine). I am aware of elasticsearch-entity-resolution-plugin, but it does not run the text through analysis.
I expect that all the queries I need will be match and multi_match with a single query text. In the scoring function I need access to the result of analyzing the query for every field I'm searching (including the relative positions of all tokens), and to the analyzed data from the same fields in the document being scored. In fact, I don't even need the tokens themselves, only the knowledge of which of them are the same.
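To make the requirement concrete, here is a minimal Python sketch of the data I am after. It is only an illustration: `analyze` is a hypothetical stand-in for a real Elasticsearch analyzer (it just lowercases and splits on whitespace), and `shared_token_ids` is a name I made up for this example. The point is that equal tokens in the query and in the document get the same integer id, and that the ids are listed in token-position order.

```python
def analyze(text):
    """Hypothetical stand-in for an analyzer: returns (position, token) pairs."""
    return list(enumerate(text.lower().split()))


def shared_token_ids(query_tokens, doc_tokens):
    """Replace tokens with integer ids shared between query and document.

    Equal tokens get equal ids; each output list is ordered by position.
    """
    ids = {}

    def encode(tokens):
        # sorted() orders the (position, token) pairs by position
        return [ids.setdefault(tok, len(ids)) for _, tok in sorted(tokens)]

    return encode(query_tokens), encode(doc_tokens)


query_ids, doc_ids = shared_token_ids(
    analyze("The quick brown fox"),
    analyze("the brown fox jumps quick"),
)
print(query_ids)  # [0, 1, 2, 3]
print(doc_ids)    # [0, 2, 3, 4, 1]
```

Two id sequences like these, one pair per searched field, would be enough as input to my similarity function.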
Is it at all possible to get this data, and if so, how can I do it? So far I have only managed to get the tokens of the document being scored, and they appear to be unordered.