Thanks for the explaining how merging works differently for knn-query and top-level-knn.
Coming back to
Will it still be useful to have a separate k parameter, if it is applicable only per shard base?
short response;
I think we should still keep support for separate k parameter. (even when at shard level). We should keep it for the very same reason why we have separate k parameter in top-level-knn. Because it not only helps during selecting top-k at coordinator node. But it is also use to select top-k within each shard.
not so short response;
For a moment lets keep Elasticsearch and sharding out of picture and lets see how search for most nearest-neighbours for given query_vector is performed in HNSW graph.
The HNSW graph consists of multiple layers (refer attached figure).
Step 1: The search begins by selecting a random entry point (current_vertex) at the highest layer.
Step 2: Calculate the distance between the query_vector and the current_vertex, as well as all neighbors of the current_vertex.
-
Case 1: If the closest vertex is a neighbor of the current_vertex, greedily select that vertex and make it the current_vertex. If the current layer is not lowest layer goto step-2, othwerwise goto step-3
-
Case 2: If the closest vertex is the current_vertex and it’s not the lowest layer, move down one level and goto Step 2. If the current layer is the lowest layer, proceed to Step 3.
Step 3: Ultimately, the current_vertex at the lowest layer is identified as the nearest neighbor for the given query_vector.
reason
During the whole search process, after randomly selecting entry point we employed “greedy” strategy to select next current_vertex. And at each step it is using only local information to make decision. Due to this the selected vertex might not be the closet neighbour to query_vector. To overcome this HNSW suggests following:
Instead of finding only one nearest neighbour on each layer, the efSearch (i.e num_candidates) closest nearest neighbours to the query_vector are found and each of these neighbours is used as the entry point on the next layer.
reference => link
Returning to the context of Elasticsearch, each shard has its own HSNW graph. Suppose the goal is to find the 10 nearest neighbors from each shard. We should use more than 10 entry points (i.e., num_candidates/efSearch should be > 10) to ensure more precise vector matches. To enable this, we should add support for k (which would be set to 10 in this case).
