script_score behavior depends on the number of requested documents

TL;DR - I ran into a problem with script_score: it seems to calculate scores differently depending on the "size" parameter of the query. Can anyone help me understand what's going on here, and whether or not this is a bug? Suggestions for alternatives that may work better in this case are also appreciated (I have already looked at "Script" queries and "Runtime fields" and don't think they will work).

Suppose that I create the following index

PUT /testindex
{
  "mappings": {
    "properties": {
      "priority": {
        "type": "float"
      }
    }
  }
}

and then index some documents

POST /testindex/_doc/1
{
  "priority": 5.5
}
POST /testindex/_doc/2
{
  "priority": 3
}

For whatever reason (this example is far more contrived than my actual use case), I would now like to use the following script_score query to search for all documents that have a "priority" of at least 2.5 (a min_score of 5 with a boost of 2):

GET /testindex/_search
{
  "size": 5,
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "should": [
        {
          "script_score": {
            "script": {
              "source": "return(doc['priority'].value)"
            },
            "query": {
              "exists": {
                "field": "priority"
              }
            },
            "boost": 2,
            "min_score": 5
          }
        }
      ]
    }
  }
}

As expected, the script calculates each document's score, then boosts it, and finally checks that the resulting score is at least the min_score. In our case the boosted scores are 11 and 6, so the query finds 2 hits.

Now run that same query again, but without retrieving any documents (set "size" to 0, as in the following version of the query). In my case, I'm only interested in aggregation results in certain scenarios.

GET /testindex/_search
{
  "size": 0,
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "should": [
        {
          "script_score": {
            "script": {
              "source": "return(doc['priority'].value)"
            },
            "query": {
              "exists": {
                "field": "priority"
              }
            },
            "boost": 2,
            "min_score": 5
          }
        }
      ]
    }
  }
}

This query actually finds a different number of hits! It only matches the document with a "priority" of 5.5. It seems to me that there is a sequencing issue in the script_score calculation when no hits are being returned in the response: the result looks as if min_score is compared against the raw script score, before the boost is applied.
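To make the suspected sequencing difference concrete, here is a minimal Python sketch of the two orderings that would reproduce the hit counts I'm seeing. It is only my guess at the arithmetic, not Elasticsearch's actual implementation, and the function names are hypothetical.

```python
# Hypothetical model of the two orderings; NOT Elasticsearch's actual code.
PRIORITIES = {"1": 5.5, "2": 3.0}  # the two documents indexed above
BOOST = 2.0
MIN_SCORE = 5.0


def hits_boost_then_filter(priorities, boost, min_score):
    """Boost the script score first, then compare it against min_score."""
    return sorted(
        doc_id for doc_id, priority in priorities.items()
        if priority * boost >= min_score
    )


def hits_filter_then_boost(priorities, boost, min_score):
    """Compare the raw script score against min_score.

    The boost is accepted for symmetry but cannot change which documents match.
    """
    return sorted(
        doc_id for doc_id, priority in priorities.items()
        if priority >= min_score
    )


print(hits_boost_then_filter(PRIORITIES, BOOST, MIN_SCORE))  # ['1', '2']
print(hits_filter_then_boost(PRIORITIES, BOOST, MIN_SCORE))  # ['1']
```

The first ordering matches what I get with "size": 5 (both documents), and the second matches what I get with "size": 0 (only document 1).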

If you made it through this and still feel like helping me, thanks a ton! :slight_smile:

Does anyone have thoughts on this? I can create a GitHub issue instead if that's a better place for this kind of question/reproduction. Thanks!
