I use a query with function_score. The first part is the search in the text (giving a score) and THEN a script is applied to compute cosine similarity.
My problem is that the cosine similarity is only computed on the documents returned by the text search, so the text search acts as a pre-filter. I will only ever get results that match the text search, even when other documents have a better cosine similarity.
This is the standard behavior of function_score according to the docs:
The function_score allows you to modify the score of documents that are retrieved by a query. This can be useful if, for example, a score function is computationally expensive and it is sufficient to compute the score on a filtered set of documents.
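To make the pre-filter effect concrete, here is a toy pure-Python simulation (not Elasticsearch code). The corpus, the `text_score` stand-in for a match query, and the choice to add the two scores are all assumptions made for illustration only.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical toy corpus: some text plus a 2-d vector per document.
DOCS = {
    "d1": {"text": "red apple", "vec": [1.0, 0.0]},
    "d2": {"text": "green pear", "vec": [0.0, 1.0]},
    "d3": {"text": "apple pie", "vec": [0.6, 0.8]},
}

def text_score(doc, term):
    """Crude stand-in for a match query: 1.0 on a hit, None when there is no match."""
    return 1.0 if term in doc["text"].split() else None

def function_score_like(term, query_vec):
    """Only documents that match the text query are rescored with the vector."""
    results = {}
    for doc_id, doc in DOCS.items():
        score = text_score(doc, term)
        if score is None:
            continue  # non-matching documents never reach the scoring function
        results[doc_id] = score + cosine(query_vec, doc["vec"])
    return results

def combined_like(term, query_vec):
    """Every document gets a vector score; the text score is added when it matches."""
    results = {}
    for doc_id, doc in DOCS.items():
        score = text_score(doc, term) or 0.0
        results[doc_id] = score + cosine(query_vec, doc["vec"])
    return results

prefiltered = function_score_like("apple", [0.0, 1.0])
combined = combined_like("apple", [0.0, 1.0])
print(sorted(prefiltered))  # d2 is missing although its vector matches best
print(sorted(combined))     # d2 is present
```

In this sketch `d2` has the highest cosine similarity with the query vector, but it only shows up in the second variant because it does not contain the search term.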
I would like the cosine similarity to be computed at query time on all documents, and that score to be combined with the text search score, with both given equal weight.
Thanks !
You will find a gist here describing the problem with a "real" example.
Hi there!
Did I understand your intention correctly? You want to go through all documents and apply a cosine similarity function to them. You also have a text query, and for the documents that match it you want to calculate that query's score. Then you want to combine the two scores, one from the cosine similarity and one from the query?
Currently, you can do that with a bool query using should clauses like this:
This will give you a sum of scores: score1 + score2. You can also apply boost for any query.
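The bool query described above can be sketched as a request body built in Python. This is a minimal sketch under assumptions: the field names `content` and `embedding` and the helper `build_combined_query` are hypothetical, and the `cosineSimilarity(params.query_vector, 'field')` script form is the one used in more recent Elasticsearch 7.x releases (earlier 7.x releases passed `doc['field']` instead), so check it against your version.

```python
import json

def build_combined_query(text, query_vector,
                         text_field="content",      # hypothetical field name
                         vector_field="embedding",  # hypothetical dense_vector field
                         text_boost=1.0, vector_boost=1.0):
    """Build a bool/should body whose score is text score + vector score."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Clause 1: text relevance, only contributes for matching docs.
                    {"match": {text_field: {"query": text, "boost": text_boost}}},
                    # Clause 2: vector similarity over ALL docs (match_all inside).
                    {
                        "script_score": {
                            "query": {"match_all": {}},
                            "script": {
                                # + 1.0 keeps the score non-negative.
                                "source": "cosineSimilarity(params.query_vector, '%s') + 1.0"
                                          % vector_field,
                                "params": {"query_vector": query_vector},
                            },
                            "boost": vector_boost,
                        }
                    },
                ]
            }
        }
    }

body = build_combined_query("apple", [0.0, 1.0])
print(json.dumps(body, indent=2))
```

Because the script_score clause wraps a match_all, every document receives a vector score, and documents that also match the text clause get that score added on top. Adjusting `text_boost` and `vector_boost` changes the relative weight of the two parts.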
We also plan to develop a compound query that will let you combine the scores of queries in ways other than a sum, but this is not available yet.