Script_score behavior depends on number of requested documents

Gkleinereva · February 6, 2023, 10:54pm

TL;DR - I encountered a problem with script_score behavior. script_score seems to calculate scores differently depending on the size parameter in the query. Can anyone help me understand what's going on here (and whether or not this is a bug)? Suggestions of alternatives that may work better in this case are also appreciated (though I have looked at "Script" queries and "Runtime fields" and don't think these will work).

Suppose that I create the following index

PUT /testindex
{
  "mappings": {
    "properties": {
      "priority": {
        "type": "float"
      }
    }
  }
}

and then index some documents

POST /testindex/_doc/1
{
  "priority": 5.5
}
POST /testindex/_doc/2
{
  "priority": 3
}

For whatever reason (this example is far more contrived than my actual use case), I now would like to use the following script_score query to search for all documents that have a "priority" of at least 2.5:

GET /testindex/_search
{
  "size": 5,
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "should": [
        {
          "script_score": {
            "script": {
              "source": "return(doc['priority'].value)"
            },
            "query": {
              "exists": {
                "field": "priority"
              }
            },
            "boost": 2,
            "min_score": 5
          }
        }
      ]
    }
  }
}

As expected, the script calculates each document's score (5.5 and 3), then boosts it (11 and 6), and finally checks that the resulting score is at least the min_score of 5. In our case, this results in 2 hits being found by the query.

Now, run that same query again, but without retrieving any documents (set "size" to 0, as has been done in the following version of the query). In my case, I'm only interested in aggregation results in certain scenarios.

GET /testindex/_search
{
  "size": 0,
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "should": [
        {
          "script_score": {
            "script": {
              "source": "return(doc['priority'].value)"
            },
            "query": {
              "exists": {
                "field": "priority"
              }
            },
            "boost": 2,
            "min_score": 5
          }
        }
      ]
    }
  }
}

This query actually finds a different number of hits! It only matches the document with a "priority" of 5.5. It seems to me that there is a sequencing issue in the script_score calculation when no hits are being returned in the response.
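To make the arithmetic concrete, here is a small Python sketch of the two orderings I suspect are in play. It does not call Elasticsearch and is only my guess at what happens internally: the function names are hypothetical, and I am assuming the min_score comparison is inclusive. The first function applies min_score after the boost (what I expected), the second compares min_score with the raw script score (which would explain the single hit I see with "size": 0).

```python
# Documents from the reproduction above: _id -> priority
DOCS = {"1": 5.5, "2": 3.0}
BOOST = 2.0
MIN_SCORE = 5.0


def hits_min_score_after_boost(docs, boost, min_score):
    """Expected order: script score, then boost, then the min_score check.

    Assumption: the comparison is inclusive (score >= min_score).
    """
    return sorted(
        doc_id for doc_id, priority in docs.items()
        if priority * boost >= min_score
    )


def hits_min_score_before_boost(docs, min_score):
    """Hypothetical order: min_score is compared with the unboosted script score."""
    return sorted(
        doc_id for doc_id, priority in docs.items()
        if priority >= min_score
    )


if __name__ == "__main__":
    # Boosted scores are 11.0 and 6.0, so both documents pass.
    print(hits_min_score_after_boost(DOCS, BOOST, MIN_SCORE))  # ['1', '2']
    # Raw scores are 5.5 and 3.0, so only document 1 passes.
    print(hits_min_score_before_boost(DOCS, MIN_SCORE))  # ['1']
```

The first result matches what I get with "size": 5 and the second matches what I get with "size": 0, which is why I suspect the boost is not being applied before the min_score check in the second case.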

If you made it through this and still feel like helping me, thanks a ton!

Gkleinereva · February 13, 2023, 9:03pm

Does anyone have thoughts on this? I can create a GitHub issue instead if that's a better place for this kind of question/reproduction. Thanks!

system · March 13, 2023, 9:03pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
