Improving runtime for dense embedding rescoring


I'm using script_score queries for rescoring search results using query and document embeddings. As expected, the runtime is notably longer than my baseline multi_match queries and I'm looking into ways to work around it.
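For illustration, a script_score request of the kind described above might look like the following sketch. The field names (`title`, `body`, `embedding`), the example vector, and the helper name `build_script_score_body` are assumptions made up for this example, not taken from the actual index. The script source uses the `cosineSimilarity` vector function documented for dense_vector fields.

```python
import json


def build_script_score_body(text, query_vector, fields, vector_field="embedding", size=10):
    """Build a _search body that scores every multi_match hit by cosine similarity.

    All names here are hypothetical placeholders; adjust them to your mapping.
    """
    return {
        "size": size,
        "query": {
            "script_score": {
                # Every document matched by this inner query is scored by the script,
                # so the cost grows with the number of matches, not with `size`.
                "query": {"multi_match": {"query": text, "fields": fields}},
                "script": {
                    # + 1.0 keeps the score non-negative, which Elasticsearch requires.
                    "source": f"cosineSimilarity(params.query_vector, '{vector_field}') + 1.0",
                    "params": {"query_vector": query_vector},
                },
            }
        },
    }


body = build_script_score_body("dense retrieval", [0.1, 0.2, 0.3], ["title", "body"])
print(json.dumps(body, indent=2))
```

The resulting dict can be sent as the request body of `_search` with whatever client you use.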

The from and size params don't affect the runtime as the rescoring seems to happen before pagination.

The docs state: "During vector functions' calculation, all matched documents are linearly scanned. Thus, expect the query time grow linearly with the number of matched documents. For this reason, we recommend to limit the number of matched documents with a query parameter."

Is there a query or index parameter/setting to limit the number of matched docs for the query, without changing the query logic (e.g. making it more specific)?
The max number of hits I get is 10k, but in my application I'm happy with 1k, which would also speed up the script_score.

I've also tried running script_score under rescore, which is notably faster. However, the docs for it state: "when exposing pagination to your users, you should not change window_size as you step through each page (by passing different from values) since that can alter the top hits causing results to confusingly shift as the user steps through pages." But what other pagination options are available besides using from?


P.S. I'm aware of the work on ANN in v8.x.

"The max number of hits I get is 10k, but in my application I'm happy with 1k, which would also speed up the script_score."

I am not sure what you mean by this. I assume you are interested in the top-scored documents from the whole document collection. To find them, Elasticsearch first finds all the matched documents and then uses the script to score each of them, so that it can pick the top-scored documents among them. Are you not interested in calculating scores for all matched documents? The _search endpoint has a terminate_after parameter, but if you use it, you are not guaranteed to find the true top-scored documents.
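As a rough sketch of the terminate_after option mentioned above: the parameter can be set in the search request body, and it caps how many documents each shard collects before stopping. The helper name `with_terminate_after` and the example values are hypothetical. Because collection stops early on each shard, documents that would have scored highest may never be seen.

```python
def with_terminate_after(body, max_docs_per_shard):
    """Return a copy of a _search body that stops collecting early on each shard.

    Hypothetical helper for illustration. The limit applies per shard, so an
    index with N shards may still collect up to N * max_docs_per_shard documents.
    """
    limited = dict(body)  # shallow copy so the caller's body is left untouched
    limited["terminate_after"] = max_docs_per_shard
    return limited


base = {"query": {"match_all": {}}, "size": 10}
limited = with_terminate_after(base, 1000)
print(limited)
```

In a real request, `base` would be the script_score body rather than the `match_all` placeholder used here.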

About rescore: you can use it, just make sure you don't change your window_size as you paginate. So you need to decide in advance how many pages of results you want to display and set window_size big enough to accommodate all of them. An alternative is to implement pagination at the application level: fetch a big enough result set at once (with a big enough size parameter), and then paginate through it in your application. The advantage is that the query is executed only once instead of once per page.
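Both suggestions can be sketched together as follows. This is only an illustration under assumptions: the field names, `WINDOW_SIZE`, and the helpers `build_rescore_body` and `paginate` are made up, and setting `query_weight` to 0.0 (so that the final score is the embedding score alone) is a choice of this example, not something prescribed above. Note that window_size applies per shard.

```python
WINDOW_SIZE = 1000  # assumption: keep this constant for every page you serve


def build_rescore_body(text, query_vector, fields, vector_field="embedding"):
    """Build a _search body that rescoring only the top WINDOW_SIZE hits per shard."""
    return {
        "size": WINDOW_SIZE,  # fetch everything once, paginate in the application
        "query": {"multi_match": {"query": text, "fields": fields}},
        "rescore": {
            "window_size": WINDOW_SIZE,
            "query": {
                "rescore_query": {
                    "script_score": {
                        "query": {"match_all": {}},
                        "script": {
                            "source": f"cosineSimilarity(params.query_vector, '{vector_field}') + 1.0",
                            "params": {"query_vector": query_vector},
                        },
                    }
                },
                # Assumption: rank the window by the embedding score only.
                "query_weight": 0.0,
                "rescore_query_weight": 1.0,
            },
        },
    }


def paginate(hits, page, page_size):
    """Return one zero-based page from an already fetched list of hits."""
    start = page * page_size
    return hits[start:start + page_size]


rescore_body = build_rescore_body("dense retrieval", [0.1, 0.2, 0.3], ["title", "body"])

# Stand-in for response["hits"]["hits"]; a real response would come from Elasticsearch.
fake_hits = [{"_id": str(i)} for i in range(25)]
last_page = paginate(fake_hits, page=2, page_size=10)
print([hit["_id"] for hit in last_page])
```

Since the full window is fetched in one request, the order of results cannot shift between pages, and stepping through pages costs no further queries.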
