Rescoring the output of a knn-query hybrid retrieval

According to the kNN documentation:

You can perform hybrid retrieval by providing both the knn option and a query. This search finds the global top k = 5 vector matches, combines them with the matches from the match query, and finally returns the 10 top-scoring results. The knn and query matches are combined through a disjunction, as if you took a boolean or between them... The score of each hit is the sum of the knn and query scores.

And according to the rescore documentation:

The query rescorer executes a second query only on the Top-K results returned by the query and post_filter phases

What is the expected behavior if we include knn, query and rescore all at the same time? Will the rescore take as input the top window_size results from the query alone? Or will it take as input the top results from the query-knn hybrid?

Our desired behavior is to rescore the top 400 results of the hybrid output. If simply specifying knn, query and rescore together does not do this, is there some other way to accomplish it?
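
For concreteness, a request of the kind being asked about might look like the sketch below, written as a Python dict that serializes to the search body. The field names (embedding, title), the query text, the vector values and num_candidates are made-up placeholders, not taken from a real index; k = 5, size 10 and window_size 400 follow the numbers mentioned above.

```python
import json

# Hypothetical search body combining knn, query and rescore.
# "embedding", "title", the vector and the query text are placeholders.
search_body = {
    "knn": {
        "field": "embedding",             # assumed dense_vector field
        "query_vector": [0.1, 0.2, 0.3],  # placeholder query vector
        "k": 5,                           # global top k vector matches
        "num_candidates": 50,             # assumed per-shard candidate count
    },
    "query": {"match": {"title": "mountain lake"}},
    "rescore": {
        "window_size": 400,               # the top 400 we would like rescored
        "query": {
            "rescore_query": {"match_phrase": {"title": "mountain lake"}},
            "query_weight": 1.0,
            "rescore_query_weight": 1.0,
        },
    },
    "size": 10,
}

print(json.dumps(search_body, indent=2))
```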

Hey @rajivhs ,

Here are the steps that occur when rescore is used in conjunction with knn and a query.

  • First, the K nearest neighbors are found. This is a global K across all shards.
  • Those kNN document scores are combined with the query, and the combined query is executed against all the shards.
  • On each shard, rescore is then called on the TOP window_size documents, ranked by the combined score.

This has the following consequences:

  • k is NOT dynamically increased to match window_size.
  • window_size is PER shard, whereas k is a global top set of values. This means a given shard may hold none of the neighbors in the global top k, in which case kNN does not contribute to the document scores on that shard.
  • Rescore will run on the combined results of query and knn. But it may be that the top-scoring documents from query dominate the knn scores, so you don't really see any knn score contribution in the top documents.
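
The steps and consequences above can be illustrated with a toy simulation. This is only a sketch of the described flow, not Elasticsearch code: the function name, the shard layout and all scores are invented, and it assumes the rescore score is simply added to the combined score for documents inside the window.

```python
from heapq import nlargest

def hybrid_then_rescore(shards, k, window_size, rescore_scores):
    """Toy model: global top-k kNN, summed scores, per-shard rescore window."""
    # Step 1: find the k nearest neighbors globally, across all shards.
    all_docs = [doc for shard in shards for doc in shard]
    top_k_ids = {d["id"] for d in nlargest(k, all_docs, key=lambda d: d["knn"])}

    merged = []
    for shard in shards:
        # Step 2: combined score = query score + kNN score, where the kNN
        # score only counts for documents in the global top k.
        hits = []
        for d in shard:
            knn_part = d["knn"] if d["id"] in top_k_ids else 0.0
            if d["query"] > 0.0 or knn_part > 0.0:  # disjunction of matches
                hits.append([d["id"], d["query"] + knn_part])
        hits.sort(key=lambda h: h[1], reverse=True)
        # Step 3: rescore only the top window_size hits of THIS shard.
        for hit in hits[:window_size]:
            hit[1] += rescore_scores.get(hit[0], 0.0)
        merged.extend(hits)

    merged.sort(key=lambda h: h[1], reverse=True)
    return [(doc_id, score) for doc_id, score in merged]

# Invented data: two shards; "knn" is a similarity, "query" a text score.
SHARDS = [
    [{"id": "a", "query": 2.0, "knn": 0.9},
     {"id": "b", "query": 0.0, "knn": 0.8},
     {"id": "c", "query": 1.0, "knn": 0.1}],
    [{"id": "d", "query": 3.0, "knn": 0.2},
     {"id": "e", "query": 0.5, "knn": 0.3}],
]
RESCORE = {"a": 0.5, "c": 5.0, "d": 0.1}

narrow = hybrid_then_rescore(SHARDS, k=2, window_size=1, rescore_scores=RESCORE)
wide = hybrid_then_rescore(SHARDS, k=2, window_size=2, rescore_scores=RESCORE)
print([doc_id for doc_id, _ in narrow])
print([doc_id for doc_id, _ in wide])
```

In this toy data both global nearest neighbors (a and b) live on the first shard, so documents on the second shard are ranked by their query scores alone. With window_size=1, document c never reaches the rescorer even though it has the largest rescore score; widening the window to 2 lets it in.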