Elastic KNN search questions

I am looking into using Elasticsearch's kNN search feature, and from what I can see, this is how we query ES for a kNN search:

GET my-index/_knn_search
{
  "knn": {
    "field": "image_vector",
    "query_vector": [0.3, 0.1, 1.2],
    "k": 10,
    "num_candidates": 100
  },
  "_source": ["name", "file_type"]
}

Here, num_candidates has a maximum of 10,000. The ES documentation describes it like this: "The number of nearest neighbor candidates to consider per shard. Cannot exceed 10,000. Elasticsearch collects num_candidates results from each shard, then merges them to find the top k results. Increasing num_candidates tends to improve the accuracy of the final k results."

The above is not very clear to me. Here are some questions:

  1. How are the 10,000 candidates chosen?
  2. If we have 1M vector documents, should we pick something like 100 shards to search across all of them, so that each shard has at most 10k documents? We need very good recall on the retrieved results.
  3. Their documentation on sharding strategy says that too many small shards are bad because each shard has its own overhead. So how do we pick shard sizes given this limit of 10k candidates per shard for kNN?

Any advice/suggestions are appreciated, thanks.

Hey @AdarshPrabhakara,

  1. How are the 10,000 candidates chosen?

num_candidates is the same idea as efSearch. It's the number of candidates we keep track of while searching the HNSW graph, and it is applied per shard.
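To make the per-shard part concrete, here is a toy sketch of the merge step the documentation describes: each shard produces its own candidate list, and the lists are combined into one global top k. This is not Elasticsearch's actual implementation; the function name, the `(score, doc_id)` tuple layout, and the sample scores are all made up for illustration.

```python
import heapq
from typing import List, Tuple

# Hypothetical hit shape: (similarity score, document id); higher score = closer.
Hit = Tuple[float, str]


def merge_top_k(per_shard_hits: List[List[Hit]], k: int) -> List[Hit]:
    """Combine the candidate lists from every shard and keep the k best scores."""
    all_hits = [hit for shard in per_shard_hits for hit in shard]
    return heapq.nlargest(k, all_hits, key=lambda hit: hit[0])


# Two hypothetical shards, each returning the candidates it found locally.
shard_a = [(0.98, "a1"), (0.91, "a2"), (0.70, "a3")]
shard_b = [(0.95, "b1"), (0.60, "b2")]

print(merge_top_k([shard_a, shard_b], k=3))
# [(0.98, 'a1'), (0.95, 'b1'), (0.91, 'a2')]
```

The point is that num_candidates bounds how many candidates each shard's graph search tracks, not how many documents the shard holds.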

  2. If we have 1M vector documents, should we pick something like 100 shards to search across all of them, so that each shard has at most 10k documents? We need very good recall on the retrieved results.

I would say not. 1M vectors should fit in a single shard. HNSW is really good at providing high recall even in larger graphs.
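As a rough back-of-the-envelope check on the single-shard claim, the raw vector data can be sized like this. The 768 dimensions are purely an assumption for illustration (the thread does not say what the real dimensionality is), and the figure covers only raw float32 values, not the HNSW graph or any other index overhead.

```python
def raw_vector_bytes(num_vectors: int, dims: int, bytes_per_value: int = 4) -> int:
    """Raw storage for the vector values alone (float32 = 4 bytes per value)."""
    return num_vectors * dims * bytes_per_value


# Assumed example: 1M vectors with a hypothetical 768 dimensions each.
size = raw_vector_bytes(1_000_000, 768)
print(f"{size} bytes, about {size / 1024**3:.2f} GiB of raw vector data")
```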

  3. So how do we pick shard sizes given this limit of 10k candidates per shard for kNN?

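I would say you shouldn't pick shard sizes based on this limit. num_candidates caps how many candidates are tracked during the graph search, not how many documents a shard can hold.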

If you want 100% recall, then you probably don't want to index the vectors at all and should just use brute-force search. But keep in mind that brute force scales linearly with the number of vectors, whereas HNSW scales logarithmically and provides much faster query speeds.
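For reference, a brute-force (exact) search in Elasticsearch is usually written as a script_score query sent to the regular `_search` endpoint. The sketch below only builds the request body as a Python dict; it does not send anything. The helper name is hypothetical, the field and `_source` names are taken from the question, and the choice of cosineSimilarity is an assumption, so swap in whichever similarity function matches your vectors.

```python
from typing import Any, Dict, List


def exact_knn_body(field: str, query_vector: List[float], k: int) -> Dict[str, Any]:
    """Build a script_score request body that scores every matching document."""
    return {
        "size": k,  # return only the k best-scoring documents
        "query": {
            "script_score": {
                # match_all means every document is scored: cost grows linearly.
                "query": {"match_all": {}},
                "script": {
                    # "+ 1.0" keeps scores non-negative, as script_score requires.
                    "source": f"cosineSimilarity(params.query_vector, '{field}') + 1.0",
                    "params": {"query_vector": query_vector},
                },
            }
        },
        "_source": ["name", "file_type"],
    }


body = exact_knn_body("image_vector", [0.3, 0.1, 1.2], k=10)
print(body["query"]["script_score"]["script"]["source"])
```

Replacing match_all with a filter that narrows the document set is the usual way to keep the linear scan affordable.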

Awesome, that is helpful. Thank you. I did not find this clarity in any official documentation.
