Hi,
I'm trying to run a 'More Like This' (MLT) query using the Apache Spark connector.
The problem is that the results are not sorted by the computed MLT score. I think this is related to the sort=_doc
parameter that the connector's query builder adds to the request.
The code is the following:
val localSpark = SparkSession
  .builder()
  .appName("teste")
  .config("spark.es.nodes", "localhost")
  .config("spark.es.port", "9200")
  .config("es.mapping.id", "id")
  .config("es.write.operation", "upsert")
  .config("spark.es.nodes.wan.only", "true")
  .config("es.scroll.size", 15)
  .master("local")
  .getOrCreate()
val query = """{"query" : {"more_like_this": { "fields": ["text"], "like": [{"_index": "documents", "_id": "1234"}]}}}"""
val df = localSpark.read
  .format("org.elasticsearch.spark.sql")
  .option("query", query)
  .option("pushdown", "true")
  .load("documents")
Setup:
- java: 1.8
- spark: 3.1.0
- scala: 2.12.12
- connector: "elasticsearch-spark-30" % "8.2.2"
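
One possible workaround for the ordering problem is to sort on the Spark side instead of relying on the order in which the connector returns hits. The sketch below is untested and rests on assumptions: that setting es.read.metadata to true adds a _metadata map column to the DataFrame, and that this map carries a "_score" entry that is actually populated for this query (it may be empty if the connector's sort=_doc disables score tracking). The object name MltSortSketch and the column name mlt_score are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Hypothetical sketch: expose the hit metadata and order by score in Spark.
object MltSortSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("mlt-sort-sketch")
      .master("local")
      .config("spark.es.nodes", "localhost")
      .config("spark.es.port", "9200")
      .config("spark.es.nodes.wan.only", "true")
      .getOrCreate()

    val query =
      """{"query": {"more_like_this": {"fields": ["text"], "like": [{"_index": "documents", "_id": "1234"}]}}}"""

    // Assumption: es.read.metadata=true adds a "_metadata" map<string,string> column.
    val withMeta = spark.read
      .format("org.elasticsearch.spark.sql")
      .option("query", query)
      .option("es.read.metadata", "true")
      .load("documents")

    // Assumption: the map has a "_score" key; rows without a score sort last.
    val ranked = withMeta
      .withColumn("mlt_score", col("_metadata").getItem("_score").cast("double"))
      .orderBy(col("mlt_score").desc_nulls_last)

    ranked.show(15, truncate = false)
    spark.stop()
  }
}
```

If mlt_score comes back null for every row, that would suggest the scores are not being tracked at all for this request, which would be consistent with the sort=_doc suspicion.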