Limit script_score rescoring to the top query matches

Hi!

I'm trying to score some documents by their dense_vector field, using a script_score query.

The cosine computation is relatively expensive, so I'd like to limit it to the best-scoring results of the "query" clause. That would let me set an upper bound on the number of documents that get rescored.

Right now I can't find any option for that because, as I understand it, the "size" parameter is applied to the results of the "script_score" itself, not to the results of the "query" clause within it.

Any ideas? Thanks!

Take a look at rescoring, that might be what you are after.

Thank you for the pointer!

So, as far as I understand it, the way to go would be to have the query structured as follows:

    {
        "query" : {
            # actual query goes here
        },
        "rescore": {
            "window_size": 100, # maximum number of rescored docs per shard
            "query": {
                "rescore_query": {
                    "script_score": {
                        "query": {
                            "match_all": {} # dummy query to catch all previous results
                        },
                        "script": {
                            # actual rescoring happens here
                        }
                    }
                }
            }
        }
    }
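
To make the skeleton above concrete, here is a minimal Python sketch that builds such a request body as a plain dict. The helper `build_rescore_body`, the field name `my_vector`, the example `match` query, and the `cosineSimilarity(params.query_vector, 'field') + 1.0` script form are assumptions for illustration (the script syntax differs between Elasticsearch versions, so check it against yours). It also sets `query_weight` to 0, assuming the goal is for the final score to come only from the script; with the default `score_mode` of `total` and default weights of 1, the original score and the rescore score would otherwise be added together.

```python
import json


def build_rescore_body(first_pass_query, query_vector,
                       vector_field="my_vector", window_size=100):
    """Build a search body that rescores only the top hits of each shard.

    vector_field is a hypothetical dense_vector field name.
    """
    # Assumed script form; older versions expect doc['field'] instead.
    source = "cosineSimilarity(params.query_vector, '" + vector_field + "') + 1.0"
    return {
        # Cheap first-pass query that selects the candidates.
        "query": first_pass_query,
        "rescore": {
            # Maximum number of documents rescored on each shard.
            "window_size": window_size,
            "query": {
                "rescore_query": {
                    "script_score": {
                        # Matches every document in the rescore window.
                        "query": {"match_all": {}},
                        "script": {
                            "source": source,
                            "params": {"query_vector": query_vector},
                        },
                    }
                },
                # Ignore the first-pass score, keep only the script score.
                "query_weight": 0.0,
                "rescore_query_weight": 1.0,
            },
        },
    }


body = build_rescore_body({"match": {"title": "example"}}, [0.1, 0.2, 0.3])
print(json.dumps(body, indent=2))
```

The resulting dict can be sent as the body of a `_search` request. Note that inside the rescore query, `_score` in the script would refer to the score of the inner `match_all`, not to the score of the first-pass query.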

So I cannot put a hard cap on the total number of documents to be rescored, but I can put a hard cap on the number of documents rescored per shard.

Correct? :slight_smile:

Indeed, rescoring happens at the shard level.
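
Since the window applies per shard, a global upper bound still follows from simple arithmetic: at most `window_size` documents are rescored on each shard, so at most `window_size * number_of_shards` in total. The sketch below illustrates this; both helper names are hypothetical, and it assumes you know how many shards the search will hit.

```python
def max_rescored_docs(window_size, num_shards):
    """Upper bound on documents rescored across all shards."""
    return window_size * num_shards


def window_size_for_budget(budget, num_shards):
    """Largest per-shard window_size that keeps the total within budget."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    if budget < num_shards:
        raise ValueError("budget is too small for one document per shard")
    return budget // num_shards


# With 5 shards and window_size 100, at most 500 documents are rescored.
print(max_rescored_docs(100, 5))

# To rescore at most 1000 documents over 3 shards, use window_size 333.
print(window_size_for_budget(1000, 3))
```

The actual number can be lower, because a shard with fewer matches than `window_size` only rescores what it has.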