I'm trying to run a 'More Like This' (MLT) query using the Apache Spark connector.
The problem is that the results are not sorted by the computed MLT score. I think it is related to the sort=_doc parameter added in the query builder.
I was able to reproduce this problem. The default sort of _doc makes sense, since that is the most efficient way for a scroll to pull back data. But I thought that maybe adding a sort field to it like this would work:
Unfortunately, it looks like that sort is silently ignored and the results are still ordered by _doc. It looks like a bug. You can probably sort the results by _score on the Spark side, but that is not going to perform as well if you have a very large amount of data.
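For what it's worth, here is a minimal sketch of that client-side workaround in plain Python. It assumes each record already carries its score; the `score` field, the sample records, and the `top_by_score` helper are all made up for illustration, and I have not confirmed that the connector populates the score when the scroll is sorted by _doc. If you only need the best N matches, a bounded top-N selection avoids fully sorting everything:

```python
import heapq


def top_by_score(hits, n):
    """Return the n hits with the highest score, best first.

    `hits` is any iterable of dicts with a numeric "score" key
    (a hypothetical shape, not the connector's actual output).
    """
    return heapq.nlargest(n, hits, key=lambda hit: hit["score"])


# Hypothetical records in the arbitrary (_doc) order a scroll returns them.
hits = [
    {"id": "a", "score": 1.2},
    {"id": "b", "score": 7.5},
    {"id": "c", "score": 3.1},
]

print([hit["id"] for hit in top_by_score(hits, 2)])  # ['b', 'c']
```

On an actual RDD the same idea should map onto something like `takeOrdered` or `top` with a key function, which keeps only N candidates per partition instead of shuffling the whole data set for a global sort.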