Improving runtime for dense embedding rescoring


I'm using script_score queries to rescore search results with dense query and document embeddings. As expected, the runtime is notably longer than for my baseline multi_match queries, and I'm looking into ways to work around it.
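For reference, here is roughly the shape of the request body I'm running (index and field names like `embedding`, plus the query vector, are placeholders for my actual setup):

```python
# Sketch of the script_score request body in question.
# "embedding" and the vector values are placeholders.
query_vector = [0.1, 0.2, 0.3]  # in practice, the encoded query embedding

body = {
    "query": {
        "script_score": {
            # every document matched by this inner query gets scored
            "query": {"match": {"text": "example search terms"}},
            "script": {
                "source": "cosineSimilarity(params.query_vector, 'embedding') + 1.0",
                "params": {"query_vector": query_vector},
            },
        }
    }
}
```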

The from and size params don't affect the runtime, as the rescoring seems to happen before pagination.

It's stated in the doc: "During vector functions' calculation, all matched documents are linearly scanned. Thus, expect the query time grow linearly with the number of matched documents. For this reason, we recommend to limit the number of matched documents with a query parameter."

Is there a query or index parameter/setting to limit the number of matched documents for the query, without changing the query logic (e.g. making it more specific)?
The maximum number of hits I get is 10k, but in my application I'm happy with 1k, which would also speed up the script_score.
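One candidate I've come across is the terminate_after search parameter, which stops collecting documents once the given count is reached, but it works per shard rather than as a global cap, and I'm not sure how well it interacts with script_score. A sketch of what I mean (field names are placeholders):

```python
# terminate_after caps how many documents are collected *per shard*,
# so it is not an exact global 1k limit, and results are flagged as
# terminated_early. Field names and values are placeholders.
body = {
    "terminate_after": 1000,
    "query": {
        "script_score": {
            "query": {"match": {"text": "example search terms"}},
            "script": {
                "source": "cosineSimilarity(params.qv, 'embedding') + 1.0",
                "params": {"qv": [0.1, 0.2, 0.3]},
            },
        }
    },
}
```

Is something like this the intended approach, or is there a cleaner way?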

I've also tried running script_score inside a rescore block, which is notably faster. However, the docs state: "when exposing pagination to your users, you should not change window_size as you step through each page (by passing different from values) since that can alter the top hits causing results to confusingly shift as the user steps through pages." But what other pagination options are available besides using from?
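For context, the rescore variant I tried looks roughly like this (window_size, weights, and field names are placeholders for my actual values):

```python
# multi_match retrieves candidates cheaply; script_score then re-ranks
# only the top window_size hits per shard. Names/values are placeholders.
body = {
    "query": {
        "multi_match": {
            "query": "example search terms",
            "fields": ["title", "body"],
        }
    },
    "rescore": {
        "window_size": 1000,  # only this many top hits per shard are rescored
        "query": {
            "rescore_query": {
                "script_score": {
                    "query": {"match_all": {}},
                    "script": {
                        "source": "cosineSimilarity(params.qv, 'embedding') + 1.0",
                        "params": {"qv": [0.1, 0.2, 0.3]},
                    },
                }
            },
            "query_weight": 0.0,         # drop the BM25 score
            "rescore_query_weight": 1.0, # keep only the embedding score
        },
    },
}
```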


P.S. I'm aware of the work on ANN in v8.x.