I have a task: given a document, I have to find the set of similar
documents. The document has title and content fields. I'd like to use
some kind of cosine similarity.
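
To make clear what I mean by cosine similarity, here is a minimal sketch in Python. It assumes plain whitespace tokenization and raw term counts (no TF-IDF weighting, no stemming), and the function name cosine_similarity is only illustrative; it is not how Lucene scores documents internally.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine of the angle between the raw term-count vectors of two texts."""
    # Assumption: lowercase + whitespace split stands in for a real analyzer.
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())

    # Dot product only needs the terms present in both texts.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))

    # An empty text has no direction, so treat its similarity as 0.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Two texts with the same term counts score close to 1, and texts with no terms in common score 0.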
My current approach is to represent the input document as a boolean
query built from all of its terms, joined with OR.
There are some shortcomings:
- the query is too large (so I expect reduced performance)
- I have to use the query_string query type, so I need to use my own
query parser (I merge all terms from all the fields into one query and
boost the terms that belong to the title)
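
Roughly, the query construction looks like the sketch below. The function name build_or_query and the boost value 2.0 are made up for illustration; only the OR operator and the ^ boost suffix come from the Lucene query string syntax.

```python
def build_or_query(title_terms, content_terms, title_boost=2.0):
    """Merge terms from both fields into one OR query, boosting title terms."""
    boosts = {}
    for term in content_terms:
        boosts.setdefault(term, 1.0)
    # A term that also appears in the title gets the (assumed) title boost.
    for term in title_terms:
        boosts[term] = title_boost

    clauses = []
    for term, boost in boosts.items():
        # Quote each term and escape backslashes and quotes so that
        # reserved query-syntax characters are not interpreted.
        escaped = term.replace("\\", "\\\\").replace('"', '\\"')
        quoted = f'"{escaped}"'
        clauses.append(quoted if boost == 1.0 else f"{quoted}^{boost}")
    return " OR ".join(clauses)


query = build_or_query(["lucene"], ["search", "lucene"])
print(query)  # "search" OR "lucene"^2.0
```

The resulting string has one clause per distinct term, which is why the query grows so large for long documents.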
My questions are: what is the best way to solve this task? Is
Elasticsearch/Lucene good for this kind of search?
I would be much obliged,
P.S. What is the "MoreLikeThis" function? Is there any description of how it