I'm trying to understand how to best implement a pre-filtered KNN vector search. To be clear: I'd like to pre-filter the total number of docs down to a smaller set, and then run a vector search that leverages a vector index over the results.
The documentation describes a Filtered KNN Search with the following note: "The filter is applied during the approximate kNN search to ensure that k matching documents are returned."
The word "during" is unclear to me.
- Does this mean that it will apply a per-shard pre-filter?
- Or does this mean that it will do a KNN search on all the data in the shard and then filter before the shard returns results, possibly fetching more results if too many got filtered out?
If a filter is applied to a KNN search, is the underlying vector index still being used, or does it revert to brute force?
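
To make the two readings in the bullets above concrete, here is a toy sketch of what I mean by pre-filtering versus post-filtering. It is plain Python with a brute-force distance scan, purely to illustrate the semantics. The function names, the `docs` layout, and the `is_red` predicate are all made up for this example, and nothing here is meant to reflect how the search engine actually implements filtered KNN.

```python
import math


def distance(a, b):
    # Euclidean distance between two equal-length vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pre_filter_knn(docs, query, k, predicate):
    # Reading 1: shrink the candidate set first, then take the k nearest
    # among the documents that passed the filter.
    candidates = [d for d in docs if predicate(d)]
    return sorted(candidates, key=lambda d: distance(d["vector"], query))[:k]


def post_filter_knn(docs, query, k, predicate):
    # Reading 2: take the k nearest over everything, then drop the ones
    # that fail the filter. This can return fewer than k documents.
    nearest = sorted(docs, key=lambda d: distance(d["vector"], query))[:k]
    return [d for d in nearest if predicate(d)]


def is_red(doc):
    # Hypothetical filter used only for this illustration.
    return doc["color"] == "red"


# Hypothetical data: two red and two blue documents.
docs = [
    {"id": 1, "color": "red", "vector": [0.0, 0.0]},
    {"id": 2, "color": "blue", "vector": [0.1, 0.0]},
    {"id": 3, "color": "blue", "vector": [0.2, 0.0]},
    {"id": 4, "color": "red", "vector": [1.0, 1.0]},
]

query = [0.0, 0.0]
pre_ids = [d["id"] for d in pre_filter_knn(docs, query, 2, is_red)]
post_ids = [d["id"] for d in post_filter_knn(docs, query, 2, is_red)]
print(pre_ids, post_ids)
```

With this data, the pre-filter version returns both red documents, while the post-filter version returns only one, because a blue document took one of the two nearest slots before the filter ran. My question is which of these behaviours "during" corresponds to, and whether the index is still used in that case.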