I want to run aggregations on a hybrid search (query + knn) to produce facets that I can select in the UI and apply as filters on subsequent queries. I'm using num_candidates=100 and k=20.
According to the documentation, aggregations are calculated on the top k nearest documents. If the request also includes a query, they are calculated on the combined set of knn and query matches.
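A minimal sketch of the kind of request body described above, written as a Python dict so it can be passed to any Elasticsearch client. The field names (title, title_vector, brand), the aggregation name, and the helper build_hybrid_request are hypothetical placeholders, not the actual mapping. Only the top-level query, knn (field, query_vector, k, num_candidates, filter) and aggs keys are taken from the search API.

```python
def build_hybrid_request(text, query_vector, brand=None, k=20, num_candidates=100):
    """Build a _search body combining a text query, knn, and a brand facet.

    All field names here are assumptions made for illustration.
    """
    query = {"match": {"title": text}}  # hypothetical text field
    knn = {
        "field": "title_vector",  # hypothetical dense_vector field
        "query_vector": query_vector,
        "k": k,
        "num_candidates": num_candidates,
    }
    if brand is not None:
        # The same brand filter is applied to both halves of the hybrid search.
        brand_filter = {"term": {"brand": brand}}
        query = {"bool": {"must": [query], "filter": [brand_filter]}}
        knn["filter"] = brand_filter
    return {
        "query": query,
        "knn": knn,
        # Facet counts are computed over the combined knn + query matches.
        "aggs": {"brands": {"terms": {"field": "brand"}}},
    }


# Scenario 1: no filter. Scenario 2: the Nike facet is selected.
request_no_filter = build_hybrid_request("Black sports shoes", [0.1, 0.2, 0.3])
request_nike = build_hybrid_request("Black sports shoes", [0.1, 0.2, 0.3], brand="Nike")
```

The three-element vector is only a stand-in; a real query vector must match the dimensions of the dense_vector field.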
I'm seeing count mismatches when filters are applied.
Scenario 1:
Hybrid search with no filters applied - total count (50)
Input: Black sports shoes
aggregations:
Nike - 20
Adidas - 15
Puma - 15
Scenario 2:
Faceted Hybrid search with 1 filter - total count (30)
selected filter: Nike
In Scenario 2, when I selected the Nike brand and applied it as a filter on the hybrid search query, I got 30 results, which is more than the Nike count (20) reported by the aggregations in Scenario 1. Is this because the filter is applied as a prefilter on the "brand" field, so the knn search looks for the nearest neighbours only among Nike documents and therefore pulls in more records?
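The prefilter hypothesis can be illustrated with a toy model. This is a sketch only: it uses exact nearest-neighbour selection over made-up documents, ignores the text query half of the hybrid search, and is not how Elasticsearch implements approximate knn. The data, the distances, and the helper knn_top_k are all invented for illustration.

```python
from collections import Counter


def knn_top_k(docs, k, brand_filter=None):
    """Toy exact knn: return the k documents with the smallest distance.

    When brand_filter is set, it is applied BEFORE the top-k cut (prefilter),
    so the k slots are filled only from documents of that brand.
    """
    candidates = [d for d in docs if brand_filter is None or d["brand"] == brand_filter]
    return sorted(candidates, key=lambda d: d["distance"])[:k]


# 90 made-up documents, brands interleaved, distance growing with the id.
brands = ["Nike", "Adidas", "Puma"]
docs = [{"id": i, "brand": brands[i % 3], "distance": float(i)} for i in range(90)]

# No filter: the facet counts come from the overall top 20.
top_unfiltered = knn_top_k(docs, k=20)
facet_counts = Counter(d["brand"] for d in top_unfiltered)

# Prefilter on Nike: the top 20 are now chosen among Nike documents only.
top_nike = knn_top_k(docs, k=20, brand_filter="Nike")

print(facet_counts["Nike"], len(top_nike))  # 7 20
```

In this model the facet reports 7 Nike documents, yet filtering by Nike returns 20, because the prefiltered search refills all k slots from the Nike subset instead of keeping only the Nike documents that were in the unfiltered top k.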
I want the counts to stay consistent after filters are applied, so that users aren't confused about the total number of results found.
Am I doing something wrong, or is there something I'm missing? Please suggest how to handle this scenario.