I have a problem with an approximate kNN query that uses a filter clause (pre-filter). The filter combines a range filter on a date field with term or wildcard queries on keyword/wildcard fields. Every filter I have tested matches documents that have the dense_vector field populated.
When I run the approximate kNN query with such a filter, my cluster returns an empty result set whenever num_candidates is less than the number of documents that match the filter. If I increase num_candidates by just 1, so that it equals the filtered document count, I get k results back as expected, for any k that I choose.
According to the documentation, if num_candidates is greater than or equal to the filtered document count, the search bypasses the HNSW graph and runs a brute-force search over the filtered documents. The behavior I am seeing therefore suggests that the brute-force search works but the approximate kNN search over the HNSW graph fails for some reason.
This does not happen for all filters. Some filters return results as expected, while others do not, and I cannot see a pattern that separates the filters that succeed from those that fail. It seems almost random.
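For anyone who wants to try reproducing this, the request shape described above can be sketched roughly as follows. This is a minimal sketch rather than my exact query: the helper names and all field names and values passed to them are placeholders, the filter shows only the range-plus-term case, and the 10,000 cap reflects my reading of the documented num_candidates limit in 8.x. The code only builds request bodies and does not talk to a cluster.

```python
from typing import Any

# Documented upper bound for num_candidates in 8.x, as I understand it.
MAX_NUM_CANDIDATES = 10_000


def build_filter(date_field: str, start: str, end: str,
                 term_field: str, term_value: str) -> dict:
    """Pre-filter: a date range combined with a term clause (names are placeholders)."""
    return {
        "bool": {
            "filter": [
                {"range": {date_field: {"gte": start, "lte": end}}},
                {"term": {term_field: term_value}},
            ]
        }
    }


def build_knn_search(vector_field: str, query_vector: list, k: int,
                     num_candidates: int, pre_filter: dict) -> dict:
    """Body for a _search request with a top-level knn section and profiling on."""
    return {
        "knn": {
            "field": vector_field,
            "query_vector": query_vector,
            "k": k,
            "num_candidates": num_candidates,
            "filter": pre_filter,
        },
        "size": k,
        "profile": True,
    }


def sweep_around(filtered_count: int, k: int, span: int = 1) -> list:
    """num_candidates values just below, at, and above the filtered doc count."""
    low, high = filtered_count - span, filtered_count + span
    # num_candidates may not be smaller than k or larger than the cap.
    return [n for n in range(low, high + 1) if k <= n <= MAX_NUM_CANDIDATES]


# Example: a filter that matches a hypothetical 500 documents, with k = 10.
pre_filter = build_filter("sent_at", "2024-10-01", "2024-10-07", "status", "delivered")
bodies = [
    build_knn_search("embedding", [0.0] * 128, 10, n, pre_filter)
    for n in sweep_around(500, 10)
]
```

Sending the bodies in order and noting at which num_candidates value the hits go from empty to k is the comparison I am describing: in my case the switch happens exactly at the filtered document count.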
The profiles of the two queries can be found below. The query that returns the expected results uses a “DocAndScoreQuery” in the knn section and a “ConstantScoreQuery” in the searches section (with a “KnnScoreDocQuery” in its children list). The query that fails uses a “MatchNoDocsQuery” in both of these sections. It seems as though Elasticsearch has decided, using some metric, that we will get no results and is therefore returning nothing. I have no explanation for why it would make this decision.
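To compare many profiles without reading them by eye, something like the following can pull out just the query types per section. It is a rough sketch that assumes each shard profile has the same shape as the ones in this post (a "dfs" object containing a "knn" list, and a "searches" list, with optional "children" under each query); the function names are my own, and the two example dictionaries are trimmed-down stand-ins, not full profiles.

```python
def query_types(nodes: list) -> list:
    """Flatten profiled query nodes (and their children) into a list of type names."""
    found = []
    for node in nodes:
        found.append(node["type"])
        found.extend(query_types(node.get("children", [])))
    return found


def summarize_shard(shard: dict) -> dict:
    """Collect the query types seen in the knn and searches sections of one shard."""
    knn_types = []
    for entry in shard.get("dfs", {}).get("knn", []):
        knn_types.extend(query_types(entry.get("query", [])))

    search_types = []
    for entry in shard.get("searches", []):
        search_types.extend(query_types(entry.get("query", [])))

    return {
        "knn": knn_types,
        "searches": search_types,
        # True when the knn phase produced only MatchNoDocsQuery entries.
        "knn_matched_nothing": bool(knn_types)
        and all(t == "MatchNoDocsQuery" for t in knn_types),
    }


# Trimmed-down stand-ins for the two shard profiles (types only).
failed_shard = {
    "dfs": {"knn": [{"query": [{"type": "MatchNoDocsQuery"}]}]},
    "searches": [{"query": [{"type": "MatchNoDocsQuery"}]}],
}
ok_shard = {
    "dfs": {"knn": [{"query": [{"type": "DocAndScoreQuery"}]}]},
    "searches": [
        {
            "query": [
                {
                    "type": "ConstantScoreQuery",
                    "children": [{"type": "KnnScoreDocQuery"}],
                }
            ]
        }
    ],
}

failed_summary = summarize_shard(failed_shard)
ok_summary = summarize_shard(ok_shard)
```

Running this over the profile of each filter makes it easy to list which filters end up with "knn_matched_nothing" set, which is the symptom I am trying to explain.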
Cluster information:
- Version: 8.7.1
- Nodes: 1
Index information:
- Shards: 1
- Doc count: 458,067 (4,095 of these do not have the required vector field for semantic search)
- Index size: 7.48GB
Mapping information:
The knn search is performed on a dense_vector field with the following properties:
- Dimension: 128
- Similarity: dot_product
- Excluded from _source
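For reference, a mapping with the properties listed above would look roughly like the sketch below. The field name "embedding" and the helper name are placeholders, and I am assuming the exclusion is done through the mapping-level _source excludes setting; the rest follows the properties listed (128 dimensions, dot_product similarity, indexed for kNN).

```python
def vector_index_mapping(field: str = "embedding", dims: int = 128) -> dict:
    """Index body with an indexed dense_vector field that is left out of _source."""
    return {
        "mappings": {
            # Assumption: the vector is excluded via the _source excludes list.
            "_source": {"excludes": [field]},
            "properties": {
                field: {
                    "type": "dense_vector",
                    "dims": dims,
                    # In 8.7 the field must be indexed explicitly for knn search.
                    "index": True,
                    "similarity": "dot_product",
                }
            },
        }
    }


mapping = vector_index_mapping()
```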
I am currently unable to reproduce the bug anywhere except in this cluster. I have other clusters with the same ES version and mappings but different numbers of nodes and shards. I cannot get knn to fail in this way on another cluster.
Does anybody have a theory as to what might be causing this? Any suggestions would be greatly appreciated.
Profile of unsuccessful query:
{
"id": "[A6OyFGpQRk-eOjezM8BZEQ][srhw-sms-2024-10-07][0]",
"dfs": {
"statistics": {
"type": "statistics",
"description": "collect term statistics",
"time_in_nanos": 5452,
"breakdown": {
"term_statistics": 0,
"collection_statistics": 0,
"collection_statistics_count": 0,
"create_weight": 3835,
"term_statistics_count": 0,
"rewrite_count": 0,
"create_weight_count": 1,
"rewrite": 0
}
},
"knn": [
{
"query": [
{
"type": "MatchNoDocsQuery",
"description": """MatchNoDocsQuery("")""",
"time_in_nanos": 813,
"breakdown": {
"set_min_competitive_score_count": 0,
"match_count": 0,
"shallow_advance_count": 0,
"set_min_competitive_score": 0,
"next_doc": 0,
"match": 0,
"next_doc_count": 0,
"score_count": 0,
"compute_max_score_count": 0,
"compute_max_score": 0,
"advance": 0,
"advance_count": 0,
"count_weight_count": 0,
"score": 0,
"build_scorer_count": 16,
"create_weight": 226,
"shallow_advance": 0,
"count_weight": 0,
"create_weight_count": 1,
"build_scorer": 587
}
}
],
"rewrite_time": 2345906,
"collector": [
{
"name": "SimpleTopScoreDocCollector",
"reason": "search_top_hits",
"time_in_nanos": 5420
}
]
}
]
},
"searches": [
{
"query": [
{
"type": "MatchNoDocsQuery",
"description": """MatchNoDocsQuery("User requested "match_none" query.")""",
"time_in_nanos": 673,
"breakdown": {
"set_min_competitive_score_count": 0,
"match_count": 0,
"shallow_advance_count": 0,
"set_min_competitive_score": 0,
"next_doc": 0,
"match": 0,
"next_doc_count": 0,
"score_count": 0,
"compute_max_score_count": 0,
"compute_max_score": 0,
"advance": 0,
"advance_count": 0,
"count_weight_count": 0,
"score": 0,
"build_scorer_count": 16,
"create_weight": 301,
"shallow_advance": 0,
"count_weight": 0,
"create_weight_count": 1,
"build_scorer": 372
}
}
],
"rewrite_time": 278,
"collector": [
{
"name": "TotalHitCountCollector",
"reason": "search_count",
"time_in_nanos": 562
}
]
}
Profile of successful query:
{
"id": "[A6OyFGpQRk-eOjezM8BZEQ][srhw-sms-2024-10-07][0]",
"dfs": {
"statistics": {
"type": "statistics",
"description": "collect term statistics",
"time_in_nanos": 3637,
"breakdown": {
"term_statistics": 0,
"collection_statistics": 0,
"collection_statistics_count": 0,
"create_weight": 2396,
"term_statistics_count": 0,
"rewrite_count": 0,
"create_weight_count": 1,
"rewrite": 0
}
},
"knn": [
{
"query": [
{
"type": "DocAndScoreQuery",
"description": "DocAndScore[242]",
"time_in_nanos": 93470,
"breakdown": {
"set_min_competitive_score_count": 0,
"match_count": 0,
"shallow_advance_count": 0,
"set_min_competitive_score": 0,
"next_doc": 7871,
"match": 0,
"next_doc_count": 242,
"score_count": 242,
"compute_max_score_count": 0,
"compute_max_score": 0,
"advance": 7277,
"advance_count": 22,
"count_weight_count": 0,
"score": 23583,
"build_scorer_count": 44,
"create_weight": 29494,
"shallow_advance": 0,
"count_weight": 0,
"create_weight_count": 1,
"build_scorer": 25245
}
}
],
"rewrite_time": 4218315,
"collector": [
{
"name": "SimpleTopScoreDocCollector",
"reason": "search_top_hits",
"time_in_nanos": 60580
}
]
}
]
},
"searches": [
{
"query": [
{
"type": "ConstantScoreQuery",
"description": "ConstantScore(ScoreAndDocQuery)",
"time_in_nanos": 96408,
"breakdown": {
"set_min_competitive_score_count": 0,
"match_count": 0,
"shallow_advance_count": 0,
"set_min_competitive_score": 0,
"next_doc": 451,
"match": 0,
"next_doc_count": 2,
"score_count": 0,
"compute_max_score_count": 0,
"compute_max_score": 0,
"advance": 9877,
"advance_count": 22,
"count_weight_count": 0,
"score": 0,
"build_scorer_count": 44,
"create_weight": 58238,
"shallow_advance": 0,
"count_weight": 0,
"create_weight_count": 1,
"build_scorer": 27842
},
"children": [
{
"type": "KnnScoreDocQuery",
"description": "ScoreAndDocQuery",
"time_in_nanos": 36738,
"breakdown": {
"set_min_competitive_score_count": 0,
"match_count": 0,
"shallow_advance_count": 0,
"set_min_competitive_score": 0,
"next_doc": 184,
"match": 0,
"next_doc_count": 2,
"score_count": 0,
"compute_max_score_count": 0,
"compute_max_score": 0,
"advance": 8740,
"advance_count": 22,
"count_weight_count": 0,
"score": 0,
"build_scorer_count": 44,
"create_weight": 11966,
"shallow_advance": 0,
"count_weight": 0,
"create_weight_count": 1,
"build_scorer": 15848
}
}
]
}
],
"rewrite_time": 9404,
"collector": [
{
"name": "TotalHitCountCollector",
"reason": "search_count",
"time_in_nanos": 3273
}
]
}