Running ANN with num_candidates > 10000

Hi. Our index holds millions of documents, and we would like to run ANN to retrieve the top-X documents, where X is greater than 10,000. However, we're hitting the "num_candidates cannot exceed 10000" error.

We don't mind making multiple calls if needed to get more than 10k results. I've experimented with using search_after together with ANN and num_candidates=10000, but this simply returns 0 results after the first 10k.

I know we can replace ANN with a brute-force exact-kNN, but we would prefer to use ANN for the performance benefits if possible.
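For reference, the brute-force alternative mentioned above could look roughly like the sketch below. It only builds the request body for an exact-kNN `script_score` query paged with search_after; it does not talk to a cluster. The field names `embedding` and `doc_id`, the helper name, and the page size are assumptions for illustration, and the `cosineSimilarity` script form shown may differ between Elasticsearch versions.

```python
from typing import Any, Dict, List, Optional


def exact_knn_page_body(
    query_vector: List[float],
    vector_field: str = "embedding",  # assumption: name of the dense_vector field
    tiebreak_field: str = "doc_id",   # assumption: a unique, sortable field
    page_size: int = 1000,
    search_after: Optional[List[Any]] = None,
) -> Dict[str, Any]:
    """Build one page of a brute-force (exact) kNN search request body."""
    body: Dict[str, Any] = {
        "size": page_size,
        "query": {
            "script_score": {
                # Score every document; narrow this query to reduce the cost.
                "query": {"match_all": {}},
                "script": {
                    # +1.0 keeps the score non-negative.
                    "source": (
                        "cosineSimilarity(params.query_vector, "
                        f"'{vector_field}') + 1.0"
                    ),
                    "params": {"query_vector": query_vector},
                },
            }
        },
        # Sort by score, with a tiebreaker so that paging is deterministic.
        "sort": [{"_score": "desc"}, {tiebreak_field: "asc"}],
    }
    if search_after is not None:
        # Sort values of the last hit from the previous page.
        body["search_after"] = search_after
    return body
```

For the next page, the `sort` values of the last hit in the previous response would be passed as `search_after`. Every page still scores all matching documents, so this trades the 10k cap for the brute-force cost.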

Is there any way at all to use ANN to get more than 10k results?

@rajivhs

Currently there is not.

Do you want to page through many hundreds of thousands of nearest neighbors? How big is the dataset?

Paging doesn't really work with kNN: your nearest neighbors are exactly that, and kNN has no natural filter as part of its retrieval (every document is a nearest neighbor at some point).
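To illustrate the point that every document is a nearest neighbor at some rank, here is a small, self-contained sketch in plain Python. It is a toy exact kNN over in-memory vectors, not how Elasticsearch works internally; the function names are made up for the example, and it assumes non-zero vectors.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_all(
    query: Sequence[float], docs: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (doc index, similarity) for every doc, most similar first."""
    scored = [(i, cosine(query, vec)) for i, vec in enumerate(docs)]
    # Nothing is filtered out: the ranking always covers the whole data set.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


def page(
    ranked: List[Tuple[int, float]], page_number: int, page_size: int
) -> List[Tuple[int, float]]:
    """Slice one page (0-based) out of the full ranking."""
    start = page_number * page_size
    return ranked[start : start + page_size]
```

Because the ranking always contains the entire data set, paging far enough eventually returns every document, including ones that point away from the query vector.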
