How to handle "ef" and "num_candidates" parameters in HNSW search

Hi team,

I am learning how to use ANN search (with HNSW) on Elasticsearch. To do so, I am comparing the results I obtain from Elasticsearch with those from the Faiss implementation of the algorithm (using the IndexHNSWFlat index).
I understand how to set the parameters M and ef_construction using index_options.
However, there should be another parameter, called simply ef, which is similar to ef_construction but is used at search time: it is described in the original HNSW article (pp. 4 and 5) and is called ef_search in the Faiss code.
I did not find a way to set ef, since it does not appear in index_options. How can I set it, and what is its default value?

Another parameter that puzzles me is num_candidates. The kNN search guide describes it as follows:

To gather results, the kNN search API finds a num_candidates number of approximate nearest neighbor candidates on each shard. The search computes the similarity of these candidate vectors to the query vector, selecting the k most similar results from each shard. The search then merges the results from each shard to return the global top k nearest neighbors.

In my case, since I am only running tests to compare Elasticsearch and Faiss results, I am using the Elasticsearch Docker image with 1 shard. Am I right to assume that in this case the num_candidates parameter is irrelevant?
In all the examples I have seen, num_candidates is 10 times k: is there a reason for that?

Thank you very much

@wole Thanks for digging into kNN search!

You are correct that M and ef_construction correspond to the parameters of the same names in Faiss. At search time, ef controls how many candidates are considered while gathering your top k.

In Elasticsearch, instead of ef, we provide num_candidates. So, when comparing with Faiss, which uses ef at search time, you should set num_candidates to the same value. This also means num_candidates still matters on a single shard.
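
To make the mapping between the two sets of parameters concrete, here is a minimal Python sketch that only builds the request bodies. The field name, dimensions, and values are illustrative assumptions, and the body shapes follow the dense_vector mapping and the knn search option as I understand them, so check them against the docs for your version. The function names are hypothetical helpers, not part of any client library.

```python
# Sketch only: field name, dims, and parameter values are illustrative
# assumptions, not values taken from this thread.

def build_mapping(field, dims, m, ef_construction):
    """Index body that sets the HNSW build-time parameters via index_options."""
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    # Assumption: l2_norm, to line up with the L2 metric that
                    # Faiss IndexHNSWFlat uses by default.
                    "similarity": "l2_norm",
                    "index_options": {
                        "type": "hnsw",
                        "m": m,
                        "ef_construction": ef_construction,
                    },
                }
            }
        }
    }


def build_knn_query(field, query_vector, k, ef_search):
    """kNN search body: num_candidates plays the role of ef at search time."""
    if ef_search < k:
        # Elasticsearch rejects requests where num_candidates is below k.
        raise ValueError("num_candidates must be at least k")
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": ef_search,
        }
    }


# Example: the same search-time value you would assign to
# index.hnsw.efSearch on the Faiss side goes into num_candidates.
mapping = build_mapping("embedding", dims=4, m=16, ef_construction=100)
query = build_knn_query("embedding", [0.1, 0.2, 0.3, 0.4], k=10, ef_search=64)
```

On the Faiss side, the equivalent settings would be the M argument of IndexHNSWFlat plus index.hnsw.efConstruction and index.hnsw.efSearch, so a fair comparison keeps all three pairs equal.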
