ElasticSearch Java Client FunctionScoreQuery score mismatch with Console score

mantegna · August 18, 2024, 11:48am

I have the following function score query, intended to find all animals by tags and score them by the number of mustHaveTags occurrences:

POST /animals/_search
{
  "size": 10,
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "filter": [
            {
              "terms": {
                "tags": ["Monkey", "Lion"]
              }
            }
          ]
        }
      },
      "functions": [
        {
          "filter": {
            "term": {
              "mustHaveTags.keyword": {"value": "Monkey"}
            }
          },
          "weight": 1
        },
        {
          "filter": {
            "term": {
              "mustHaveTags.keyword": {"value": "Lion"}
            }
          },
          "weight": 1
        }
      ],
      "score_mode": "sum",
      "boost_mode": "sum"
    }
  }
}

When executed from the Kibana Console, it returns something like this:

{
  "hits": {
    "total": {
      "value": 84,
      "relation": "eq"
    },
    "max_score": 2, <- !!!
    "hits": [...]
  }
}

The hits with score = 2 are the animals whose mustHaveTags contain both "Lion" and "Monkey".
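
For reference, the arithmetic behind that score can be sketched in a few lines of Java. This is only an illustration of how score_mode "sum" and boost_mode "sum" combine for the query above, not Elasticsearch's actual implementation. The class and method names are hypothetical, and the case where no function filter matches is deliberately not modeled.

```java
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of any Elasticsearch API.
class FunctionScoreSketch {

    // Expected score for a document that matches at least one function filter.
    static double expectedScore(Set<String> mustHaveTags, List<String> functionTags) {
        double queryScore = 0.0;  // bool/filter clauses run in filter context and contribute 0
        double functionSum = 0.0; // score_mode "sum": add up the weights of matching functions
        for (String tag : functionTags) {
            if (mustHaveTags.contains(tag)) {
                functionSum += 1.0; // each function in the query has "weight": 1
            }
        }
        return queryScore + functionSum; // boost_mode "sum"
    }

    public static void main(String[] args) {
        List<String> functionTags = List.of("Monkey", "Lion");
        System.out.println(expectedScore(Set.of("Monkey", "Lion"), functionTags)); // prints 2.0
        System.out.println(expectedScore(Set.of("Lion"), functionTags));           // prints 1.0
    }
}
```

Under this model, a maximum score of 1 would mean that no document matched both function filters in that request.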

However, I then migrated this query to the native Elasticsearch Java Client (I'm using Kotlin, but I don't think it matters):

val functionScoreQuery = QueryBuilders.functionScore {
    it
        .query(
            QueryBuilders.bool { b ->
                b.filter { fb ->
                    fb.terms {
                        TermsQuery.Builder()
                            .field("tags")
                            .terms { tm -> tm.value(tags.map { tag -> FieldValue.of(tag) }) }
                    }
                }
            }
        )
        .functions(tags.map { tag ->
            FunctionScore.Builder()
                .filter(QueryBuilders.term { t -> t.field("mustHaveTags.keyword").value(tag) })
                .weight(1.0)
                .build()
        })
        .boostMode(FunctionBoostMode.Sum)
        .scoreMode(FunctionScoreMode.Sum)
}


val searchResponse: SearchResponse<Animal> = client.search(
    SearchRequest.of { sr ->
        sr.index("animals")
            .query(functionScoreQuery)
            .size(30)
    }, Animal::class.java
)

That produces the following (similar!) query when I call toString() in the debugger:

{
  "query": {
    "function_score": {
      "boost_mode": "sum",
      "functions": [
        {
          "filter": {
            "term": {
              "mustHaveTags.keyword": {
                "value": "Monkey"
              }
            }
          },
          "weight": 1
        },
        {
          "filter": {
            "term": {
              "mustHaveTags.keyword": {
                "value": "Lion"
              }
            }
          },
          "weight": 1
        }
      ],
      "query": {
        "bool": {
          "filter": [
            {
              "terms": {
                "tags": [
                  "Lion",
                  "Monkey"
                ]
              }
            }
          ]
        }
      },
      "score_mode": "sum"
    }
  }
}

It returns the same number of hits, but with a maximum score of 1: documents with more than one mustHaveTags occurrence don't get a higher score. Any ideas why?

ltrotta · August 20, 2024, 9:58am

Hello and welcome! I tried reproducing this with the latest version of the Java client (8.15.0). Here is the query I wrote:

Query query = Query.of(q -> q.functionScore(f -> f
    .scoreMode(FunctionScoreMode.Sum)
    .boostMode(FunctionBoostMode.Sum)
    .query(qq -> qq
        .bool(b -> b
            .filter(ff -> ff
                .terms(t -> t
                    .field("tags.keyword") // had to use the keyword sub-field here, probably because my test documents were indexed differently
                    .terms(tt -> tt.value(List.of(FieldValue.of("Monkey"),FieldValue.of("Lion"))))))))
    .functions(List.of(
        FunctionScore.of(fs -> fs
            .weight(1D)
            .filter(fi -> fi.term(tm -> tm.field("mustHaveTags.keyword").value("Monkey")))),
        FunctionScore.of(fs -> fs
            .weight(1D)
            .filter(fi -> fi.term(tm -> tm.field("mustHaveTags.keyword").value("Lion"))))))));

// prints to json
System.out.println(JsonpUtils.toJsonString(query, esClient._jsonpMapper()));

esClient.search(s -> s.query(query),Object.class);

The result, converted to JSON, is the following:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "failed": 0,
    "successful": 21,
    "total": 21,
    "skipped": 0
  },
  "hits": {
    "total": {
      "relation": "eq",
      "value": 3
    },
    "hits": [
    ...
    ],
    "max_score": 2
  }
}

This is the same result I get when running the query in the Kibana Console.

There could be some subtle differences between the JSON query and how you mapped it in the client that I'm failing to see because I'm not familiar with Kotlin. To check this, you could convert the query to JSON using

JsonpUtils.toJsonString(yourQuery, esClient._jsonpMapper());

and check that the output is exactly the same as the one in Kibana.
