I am looking into using Elasticsearch's kNN search feature, and from what I can see this is how we query ES for a kNN search:
GET my-index/_knn_search
{
  "knn": {
    "field": "image_vector",
    "query_vector": [0.3, 0.1, 1.2],
    "k": 10,
    "num_candidates": 100
  },
  "_source": ["name", "file_type"]
}
Here num_candidates has a maximum of 10,000, and the ES documentation describes it like this: "The number of nearest neighbor candidates to consider per shard. Cannot exceed 10,000. Elasticsearch collects num_candidates results from each shard, then merges them to find the top k results. Increasing num_candidates tends to improve the accuracy of the final k results."
The above is not very clear to me. Here are some questions:
- How are the 10,000 candidates chosen?
- If we have 1M vector documents, should we use something like 100 shards to search across all of them, so that each shard holds at most 10k documents? We need very good recall on the retrieved results.
- Their documentation on choosing a sharding strategy says that having too many small shards is bad because each shard carries its own overhead. So how do we pick shard sizes given this limit of 10k candidates per shard for kNN?
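For reference, here is my current reading of the "collect num_candidates per shard, then merge to the top k" description, as a toy Python sketch. This is only an assumption about the behaviour, not Elasticsearch code: the function names (shard_candidates, knn_search) are made up for illustration, and a brute-force distance scan stands in for whatever approximate search Elasticsearch actually runs inside each shard.

```python
import heapq
import math

MAX_NUM_CANDIDATES = 10_000  # limit quoted from the ES documentation


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def shard_candidates(shard_docs, query_vector, num_candidates):
    """Return up to num_candidates (distance, doc_id) pairs from one shard.

    Assumption: a brute-force scan replaces the real per-shard search.
    """
    scored = [
        (l2_distance(vector, query_vector), doc_id)
        for doc_id, vector in shard_docs.items()
    ]
    return heapq.nsmallest(num_candidates, scored)


def knn_search(shards, query_vector, k, num_candidates):
    """Collect candidates from every shard, then keep the global top k."""
    if num_candidates > MAX_NUM_CANDIDATES:
        raise ValueError("num_candidates cannot exceed 10,000")
    merged = []
    for shard_docs in shards:
        merged.extend(shard_candidates(shard_docs, query_vector, num_candidates))
    return [doc_id for _, doc_id in heapq.nsmallest(k, merged)]


# Hypothetical two-shard index, each shard mapping doc id -> vector.
shards = [
    {"a": [0.0, 0.0], "b": [5.0, 5.0]},
    {"c": [1.0, 0.0], "d": [0.1, 0.0]},
]
print(knn_search(shards, [0.0, 0.0], k=2, num_candidates=2))
```

In this sketch each shard contributes at most num_candidates entries to the merge step, regardless of how many documents the shard holds. Whether that matches what Elasticsearch really does is part of what I am asking.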
Any advice/suggestions are appreciated, thanks.