I'm using Elasticsearch to index documents and then, given another document, I score similarity using the "more_like_this" query.
Just two questions:
Does the "more_like_this" query use cosine similarity to score documents (I've read the documentation, but I'm still not sure)?
Is there a way to get the scores between 0 and 1?
I have found a fairly old code snippet [1] to calculate the cosine
similarity in Lucene, but I was wondering whether Elasticsearch provides an
easier API to access this information.
[1]
The MLT query translates the source document into a normal boolean terms query.
That query is then run against the index, and the usual TF-IDF scoring is used
to compute the score.
AFAIK, that is how MLT works.
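To make the description above concrete, here is a rough Python sketch of the idea, not the actual Lucene/Elasticsearch implementation. It picks the highest TF-IDF terms from a source document, expresses them as a bool/should query body, and rescales raw hit scores into the 0 to 1 range. The helper names (top_terms, as_bool_query, normalize_scores), the IDF formula, and the "divide by the top score" normalization are all assumptions made for illustration. The raw TF-IDF scores are not bounded, so the rescaled values are only relative to one result set and are not cosine similarities.

```python
import math
from collections import Counter


def top_terms(doc_tokens, corpus, max_query_terms=3):
    """Hypothetical helper: pick the highest TF-IDF terms of the source document.

    `corpus` is a list of token lists standing in for the indexed documents.
    The IDF formula here is a simple smoothed variant, chosen for illustration.
    """
    n_docs = len(corpus)
    tf = Counter(doc_tokens)
    scored = {}
    for term, freq in tf.items():
        df = sum(1 for d in corpus if term in d)  # document frequency
        idf = math.log((n_docs + 1) / (df + 1)) + 1.0
        scored[term] = freq * idf
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scored, key=lambda t: (-scored[t], t))
    return ranked[:max_query_terms]


def as_bool_query(field, terms):
    """Express the selected terms as a bool/should query body (one term clause each)."""
    return {"query": {"bool": {"should": [{"term": {field: t}} for t in terms]}}}


def normalize_scores(hits):
    """Assumed workaround: rescale raw scores to [0, 1] by dividing by the top score."""
    top = max((h["_score"] for h in hits), default=0.0)
    if top <= 0:
        return [dict(h, _norm=0.0) for h in hits]
    return [dict(h, _norm=h["_score"] / top) for h in hits]


if __name__ == "__main__":
    corpus = [["a", "b"], ["a", "c"], ["a", "d"]]
    terms = top_terms(["a", "c", "c"], corpus, max_query_terms=1)
    print(as_bool_query("text", terms))
    print(normalize_scores([{"_id": "1", "_score": 4.0}, {"_id": "2", "_score": 1.0}]))
```

With this kind of rescaling the best hit always gets 1.0, whether or not it is actually a close match, so the values should not be compared across different queries.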