Script score query does not pass correct _score to sort scripts when using field collapse

On Elasticsearch 8.7.1, when I add a collapse parameter to a search that uses a script_score query, references to _score in the sort scripts no longer refer to the score computed by script_score. Instead they seem to refer to some other value, perhaps the score of the inner query.

This does not happen without field collapse. In that case the script score is passed as _score to the scripts used in the sort parameter.

I couldn't find other reports or documentation of this issue and am wondering if it's a bug.

An example is below. It is not directly reproducible because it runs against our data, but it should be simple enough to adapt to other datasets.

Expected behavior: with track_scores set to true, the value returned by return _score in the sort script matches the score computed by the script_score script, whether or not the collapse parameter is present.

Actual behavior: without collapse, the value returned by return _score in the sort script equals the script_score value (as expected). With collapse, it differs (unexpected): the score appears to come from the match_all query instead, since every _score value in the sort script is 1.

EDIT: this may be the same problem as the one reported in elastic/elasticsearch issue #87772 on GitHub, titled: script sort with "collapse" failed to get_ score variable.

A possible fix is mentioned there. Does anyone know which version includes it? I ran the same query on 8.7.1 and 8.11.1, and 8.11.1 behaved as I expected (it passed the right _score to the sort scripts).

POST fundraising-v20/_search
{
  "_source": {
    "includes": [
      "loanId",
      "isActionable",
      "isoCode",
      "partnerId"
    ]
  },
  "collapse": {
    "field": "isoCode"
  },
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "params": {},
        "lang": "painless",
        "source": """
          long x = 0;
          if (!doc['partnerId'].empty) {
              x = doc['partnerId'].value;
          }
          x;
        """
      }
    }
  },
  "sort": [
    {
      "_script": {
        "order": "desc",
        "script": {
          "params": {},
          "lang": "painless",
          "source": "return _score;"
        },
        "type": "number"
      }
    }
  ],
  "track_scores": true,
  "track_total_hits": true,
  "version": true
}
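
For comparison, here is a minimal sketch of the same request with the collapse block removed, which is the case described above as behaving correctly. The index and field names are taken from the original request. The _source, track_total_hits, and version options are dropped only to keep the sketch short, on the assumption that they do not affect scoring. With track_scores enabled, each hit's sort value should equal its _score.

```
POST fundraising-v20/_search
{
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "lang": "painless",
        "source": """
          long x = 0;
          if (!doc['partnerId'].empty) {
              x = doc['partnerId'].value;
          }
          x;
        """
      }
    }
  },
  "sort": [
    {
      "_script": {
        "order": "desc",
        "type": "number",
        "script": {
          "lang": "painless",
          "source": "return _score;"
        }
      }
    }
  ],
  "track_scores": true
}
```

Running the collapsed and uncollapsed requests against the same index, and comparing the sort value with _score on each hit, is one way to check whether a given version shows the problem.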

Hi @rschoenbeck, I experimented a bit, and it looks like this has been fixed in 8.15.

I hope that helps.

I later noticed it was already fixed in 8.11.1, which is the Elasticsearch version used by a Spring Data Elasticsearch release newer than our current one, so I think upgrading to that release is the path we'll take. Thanks!
