Inconsistent total hits in hybrid search lead to incorrect aggregations

Hi All,

My team is working on a system that uses hybrid search, combining kNN search with full-text queries. One use case requires counting the values of a tag field: we use a terms aggregation, display the counts in the UI, and let users filter the results by tag.

We then found that the guide "Aggregation with kNN" states that, for approximate kNN search, aggregations are calculated on the top k nearest documents. This can throw off the aggregation counts: when a tag filter is applied, the kNN search makes sure that k documents matching the tag are returned, so the counts from the first, unfiltered request no longer match what the filtered request returns.
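To make the mismatch concrete, here is a toy sketch in plain Python (no Elasticsearch involved). The documents, their tags, and the helper names `knn_top_k` and `tag_counts` are all made up for illustration; it only assumes that the tag filter is applied before the k nearest documents are picked and that the aggregation sees only those k documents.

```python
from collections import Counter

# Hypothetical documents, already ordered from nearest to farthest
# from the query vector. IDs and tags are invented for this example.
DOCS_BY_DISTANCE = [
    {"id": 1, "tags": ["novel"]},
    {"id": 2, "tags": ["comic"]},
    {"id": 3, "tags": ["novel"]},
    {"id": 4, "tags": ["comic"]},
    {"id": 5, "tags": ["comic"]},
    {"id": 6, "tags": ["magazine"]},
]


def knn_top_k(docs, k, tag=None):
    """Return the k nearest docs, applying the tag filter before picking k."""
    candidates = [d for d in docs if tag is None or tag in d["tags"]]
    return candidates[:k]


def tag_counts(hits):
    """Mimic a terms aggregation computed over the returned hits only."""
    return Counter(tag for hit in hits for tag in hit["tags"])


unfiltered = knn_top_k(DOCS_BY_DISTANCE, k=3)
filtered = knn_top_k(DOCS_BY_DISTANCE, k=3, tag="comic")

print(tag_counts(unfiltered)["comic"])  # 1: the count shown before filtering
print(len(filtered))                    # 3: the hits returned after filtering
```

In this toy data the first request would show "comic (1)", but filtering by that tag returns three documents, which is the kind of discrepancy described above.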

We did some research and tried the kNN query instead. Although it sounds like it matches our needs, the total hits are inconsistent between search requests with the same request body, which makes the aggregation results incorrect too.

We don't understand this behavior. We found that it does not happen when we keep only one kNN vector search or use a small size.

Could you help explain what is happening in the background here, and suggest any solutions or workarounds for implementing a terms aggregation, and filtering by it, with hybrid search? Any suggestions or opinions are appreciated. Thank you.

Below are my queries.

This is the hybrid search with the top-level knn section:

GET product_books/_search
{
  "size": 200,
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              {
                "match_phrase": {
                  "product_title": {
                    "query": "Hello World",
                    "boost": 3.27
                  }
                }
              },
              {
                "match_phrase": {
                  "product_title_kana": {
                    "query": "Hello World",
                    "boost": 2.84
                  }
                }
              },
              {
                "match_phrase": {
                  "author_1": {
                    "query": "Hello World",
                    "boost": 2.44
                  }
                }
              },
              {
                "match_phrase": {
                  "author_1_kana": {
                    "query": "Hello World",
                    "boost": 2.3
                  }
                }
              },
              {
                "match_phrase": {
                  "author_2": {
                    "query": "Hello World",
                    "boost": 2.44
                  }
                }
              },
              {
                "match_phrase": {
                  "author_2_kana": {
                    "query": "Hello World",
                    "boost": 2.3
                  }
                }
              },
              {
                "match_phrase": {
                  "search_text.search_text": {
                    "query": "Hello World",
                    "boost": 1.04
                  }
                }
              },
              {
                "match_phrase": {
                  "search_text.search_ngram": {
                    "query": "Hello World",
                    "boost": 0.24
                  }
                }
              },
              {
                "match_phrase": {
                  "search_text.search_ngram_norm": {
                    "query": "Hello World",
                    "boost": 0.17
                  }
                }
              },
              {
                "match_phrase_prefix": {
                  "product_title": {
                    "query": "Hello World",
                    "boost": 1.64
                  }
                }
              },
              {
                "match_phrase_prefix": {
                  "product_title_kana": {
                    "query": "Hello World",
                    "boost": 1.44
                  }
                }
              },
              {
                "match_phrase_prefix": {
                  "author_1": {
                    "query": "Hello World",
                    "boost": 1.2
                  }
                }
              },
              {
                "match_phrase_prefix": {
                  "author_1_kana": {
                    "query": "Hello World",
                    "boost": 1.14
                  }
                }
              },
              {
                "match_phrase_prefix": {
                  "author_2": {
                    "query": "Hello World",
                    "boost": 1.2
                  }
                }
              },
              {
                "match_phrase_prefix": {
                  "author_2_kana": {
                    "query": "Hello World",
                    "boost": 1.14
                  }
                }
              },
              {
                "match_phrase_prefix": {
                  "search_text.search_text": {
                    "query": "Hello World",
                    "boost": 0.5
                  }
                }
              },
              {
                "match_phrase_prefix": {
                  "search_text.search_ngram": {
                    "query": "Hello World",
                    "boost": 0.14
                  }
                }
              },
              {
                "match_phrase_prefix": {
                  "search_text.search_ngram_norm": {
                    "query": "Hello World",
                    "boost": 0.1
                  }
                }
              },
              {
                "match": {
                  "product_title": {
                    "query": "Hello World",
                    "boost": 0.67
                  }
                }
              },
              {
                "match": {
                  "product_title_kana": {
                    "query": "Hello World",
                    "boost": 0.57
                  }
                }
              },
              {
                "match": {
                  "author_1": {
                    "query": "Hello World",
                    "boost": 0.5
                  }
                }
              },
              {
                "match": {
                  "author_1_kana": {
                    "query": "Hello World",
                    "boost": 0.47
                  }
                }
              },
              {
                "match": {
                  "author_2": {
                    "query": "Hello World",
                    "boost": 0.5
                  }
                }
              },
              {
                "match": {
                  "author_2_kana": {
                    "query": "Hello World",
                    "boost": 0.47
                  }
                }
              },
              {
                "match": {
                  "search_text.search_text": {
                    "query": "Hello World",
                    "boost": 0.2
                  }
                }
              },
              {
                "match": {
                  "search_text.search_ngram": {
                    "query": "Hello World",
                    "boost": 0.04
                  }
                }
              },
              {
                "match": {
                  "search_text.search_ngram_norm": {
                    "query": "Hello World",
                    "boost": 0.03
                  }
                }
              },
              {
                "term": {
                  "jan": {
                    "value": "Hello World",
                    "boost": 3.27
                  }
                }
              },
              {
                "term": {
                  "isbn_10": {
                    "value": "Hello World",
                    "boost": 3.27
                  }
                }
              }
            ],
            "minimum_should_match": 6
          }
        }
      ],
      "filter": [
        {
          "term": {
            "tags": "コミック"
          }
        }
      ]
    }
  },
  "knn": [
    {
      "field": "semantic_description_vector",
      "query_vector_builder": {
        "text_embedding": {
          "model_id": "intfloat__multilingual-e5-base_query",
          "model_text": "Hello World"
        }
      },
      "k": 15,
      "num_candidates": 50,
      "boost": 28,
      "filter": [
        {
          "term": {
            "tags": "コミック"
          }
        }
      ]
    },
    {
      "field": "semantic_description_vector",
      "query_vector_builder": {
        "text_embedding": {
          "model_id": "intfloat__multilingual-e5-base_query",
          "model_text": "Hello World"
        }
      },
      "k": 15,
      "num_candidates": 50,
      "boost": 28,
      "filter": [
        {
          "term": {
            "tags": "コミック"
          }
        }
      ]
    },
    {
      "field": "semantic_metadata_vector",
      "query_vector_builder": {
        "text_embedding": {
          "model_id": "intfloat__multilingual-e5-base_query",
          "model_text": "Hello World"
        }
      },
      "k": 15,
      "num_candidates": 50,
      "boost": 28,
      "filter": [
        {
          "term": {
            "tags": "コミック"
          }
        }
      ]
    }
  ]
}

This is the hybrid search with the Query DSL kNN query:

GET product_books/_search
{
  "size": 200,
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              {
                "bool": {
                  "should": [
                    {
                      "match_phrase": {
                        "product_title": {
                          "query": "Hello World",
                          "boost": 3.27
                        }
                      }
                    },
                    {
                      "match_phrase": {
                        "product_title_kana": {
                          "query": "Hello World",
                          "boost": 2.84
                        }
                      }
                    },
                    {
                      "match_phrase": {
                        "author_1": {
                          "query": "Hello World",
                          "boost": 2.44
                        }
                      }
                    },
                    {
                      "match_phrase": {
                        "author_1_kana": {
                          "query": "Hello World",
                          "boost": 2.3
                        }
                      }
                    },
                    {
                      "match_phrase": {
                        "author_2": {
                          "query": "Hello World",
                          "boost": 2.44
                        }
                      }
                    },
                    {
                      "match_phrase": {
                        "author_2_kana": {
                          "query": "Hello World",
                          "boost": 2.3
                        }
                      }
                    },
                    {
                      "match_phrase": {
                        "search_text.search_text": {
                          "query": "Hello World",
                          "boost": 1.04
                        }
                      }
                    },
                    {
                      "match_phrase": {
                        "search_text.search_ngram": {
                          "query": "Hello World",
                          "boost": 0.24
                        }
                      }
                    },
                    {
                      "match_phrase": {
                        "search_text.search_ngram_norm": {
                          "query": "Hello World",
                          "boost": 0.17
                        }
                      }
                    },
                    {
                      "match_phrase_prefix": {
                        "product_title": {
                          "query": "Hello World",
                          "boost": 1.64
                        }
                      }
                    },
                    {
                      "match_phrase_prefix": {
                        "product_title_kana": {
                          "query": "Hello World",
                          "boost": 1.44
                        }
                      }
                    },
                    {
                      "match_phrase_prefix": {
                        "author_1": {
                          "query": "Hello World",
                          "boost": 1.2
                        }
                      }
                    },
                    {
                      "match_phrase_prefix": {
                        "author_1_kana": {
                          "query": "Hello World",
                          "boost": 1.14
                        }
                      }
                    },
                    {
                      "match_phrase_prefix": {
                        "author_2": {
                          "query": "Hello World",
                          "boost": 1.2
                        }
                      }
                    },
                    {
                      "match_phrase_prefix": {
                        "author_2_kana": {
                          "query": "Hello World",
                          "boost": 1.14
                        }
                      }
                    },
                    {
                      "match_phrase_prefix": {
                        "search_text.search_text": {
                          "query": "Hello World",
                          "boost": 0.5
                        }
                      }
                    },
                    {
                      "match_phrase_prefix": {
                        "search_text.search_ngram": {
                          "query": "Hello World",
                          "boost": 0.14
                        }
                      }
                    },
                    {
                      "match_phrase_prefix": {
                        "search_text.search_ngram_norm": {
                          "query": "Hello World",
                          "boost": 0.1
                        }
                      }
                    },
                    {
                      "match": {
                        "product_title": {
                          "query": "Hello World",
                          "boost": 0.67
                        }
                      }
                    },
                    {
                      "match": {
                        "product_title_kana": {
                          "query": "Hello World",
                          "boost": 0.57
                        }
                      }
                    },
                    {
                      "match": {
                        "author_1": {
                          "query": "Hello World",
                          "boost": 0.5
                        }
                      }
                    },
                    {
                      "match": {
                        "author_1_kana": {
                          "query": "Hello World",
                          "boost": 0.47
                        }
                      }
                    },
                    {
                      "match": {
                        "author_2": {
                          "query": "Hello World",
                          "boost": 0.5
                        }
                      }
                    },
                    {
                      "match": {
                        "author_2_kana": {
                          "query": "Hello World",
                          "boost": 0.47
                        }
                      }
                    },
                    {
                      "match": {
                        "search_text.search_text": {
                          "query": "Hello World",
                          "boost": 0.2
                        }
                      }
                    },
                    {
                      "match": {
                        "search_text.search_ngram": {
                          "query": "Hello World",
                          "boost": 0.04
                        }
                      }
                    },
                    {
                      "match": {
                        "search_text.search_ngram_norm": {
                          "query": "Hello World",
                          "boost": 0.03
                        }
                      }
                    },
                    {
                      "term": {
                        "jan": {
                          "value": "Hello World",
                          "boost": 3.27
                        }
                      }
                    },
                    {
                      "term": {
                        "isbn_10": {
                          "value": "Hello World",
                          "boost": 3.27
                        }
                      }
                    }
                  ],
                  "minimum_should_match": 6
                }
              }
            ]
          }
        },
        {
          "knn": {
            "field": "semantic_title_vector",
            "query_vector_builder": {
              "text_embedding": {
                "model_id": "intfloat__multilingual-e5-base_query",
                "model_text": "query: Hello World"
              }
            },
            "_name": "knn_query"
          }
        },
        {
          "knn": {
            "field": "semantic_description_vector",
            "query_vector_builder": {
              "text_embedding": {
                "model_id": "intfloat__multilingual-e5-base_query",
                "model_text": "query: Hello World"
              }
            },
            "_name": "knn_query"
          }
        },
        {
          "knn": {
            "field": "semantic_metadata_vector",
            "query_vector_builder": {
              "text_embedding": {
                "model_id": "intfloat__multilingual-e5-base_query",
                "model_text": "query: Hello World"
              }
            },
            "_name": "knn_query"
          }
        }
      ]
    }
  },
  "aggs": {
    "tags": {
      "terms": {
        "field": "tags",
        "size": 100
      }
    }
  }
}

This is pretty much the same as the topic "Inconsistency in No of Total Hits", but ours is a hybrid search use case.

As for the Elasticsearch version, we are using Elastic Cloud with Elasticsearch 8.14.3.

I am experiencing the same behaviour with hybrid queries on ES 8.15 (also on Elastic Cloud). The knn query clause specifies numCandidates=100. The number of hits ranges between 99 and 101 across executions of the same query.

Yes, this behavior is a bit confusing. As far as I know, kNN makes ES collect num_candidates candidates from each shard during the search and then choose the top k results. For the same keyword, num_candidates stays the same, so I would expect the results to be consistent.
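One possible way to picture the fluctuation, as a rough sketch rather than an explanation of the actual internals: in a hybrid request a document counts as a hit if it matches the full-text part or any of the kNN clauses, so the total is the size of the union of those sets. The sketch assumes that an approximate kNN clause can return a slightly different set of documents between two runs of the same request; the document IDs and the helper `total_hits` are hypothetical.

```python
def total_hits(text_matches, knn_results):
    """Total hits of a disjunction: size of the union of all matching doc IDs."""
    matched = set(text_matches)
    for clause_hits in knn_results:
        matched |= set(clause_hits)
    return len(matched)


# Hypothetical doc IDs matched by the full-text part of the query.
text_matches = {1, 2, 3, 4}

# Run A: two kNN clauses return these approximate neighbours.
run_a = [{3, 4, 5}, {4, 5, 6}]

# Run B: same request, but the second clause returns doc 2 instead of doc 6.
# Doc 2 already matches the full-text part, so it adds nothing to the total.
run_b = [{3, 4, 5}, {4, 5, 2}]

print(total_hits(text_matches, run_a))  # 6
print(total_hits(text_matches, run_b))  # 5
```

Under this assumption, every extra kNN clause and every larger candidate set adds more chances for the overlap to change, which would be consistent with the total being stable when only one kNN clause or a small size is used.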

I have just found out that the results of the top-level knn section are also inconsistent when num_candidates and k are set high, but not as much as with the knn query: the total from the top-level knn section fluctuates by around 1-2 documents.

For the aggregation, we figured out a way to do this using Post Filter. It took us a while to find those docs. I think that section should be referenced from the kNN search filter documentation, in case anyone else is struggling with the same problem we had.
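For anyone landing here later, this is a minimal sketch of the shape of such a request, written as a Python helper that builds the search body. The helper name `build_search_body` and the simplified `match` query are my own placeholders, not our production query; the point is only that the tag filter moves out of `query` and `knn` into the top-level `post_filter`, which narrows the returned hits after the aggregations have been calculated.

```python
import json


def build_search_body(query, knn_clauses, tag=None, agg_size=100):
    """Build a search body whose tag counts ignore the selected tag."""
    body = {
        "query": query,
        "knn": knn_clauses,
        # Terms aggregation on the tag field, as in the request above.
        "aggs": {"tags": {"terms": {"field": "tags", "size": agg_size}}},
    }
    if tag is not None:
        # post_filter is applied to the hits only, not to the aggregations.
        body["post_filter"] = {"term": {"tags": tag}}
    return body


body = build_search_body(
    # Placeholder for the large bool query from the original request.
    query={"match": {"product_title": "Hello World"}},
    knn_clauses=[
        {
            "field": "semantic_description_vector",
            "query_vector_builder": {
                "text_embedding": {
                    "model_id": "intfloat__multilingual-e5-base_query",
                    "model_text": "Hello World",
                }
            },
            "k": 15,
            "num_candidates": 50,
        }
    ],
    tag="コミック",
)

print(json.dumps(body, ensure_ascii=False, indent=2))
```

One trade-off to keep in mind: because the kNN clauses no longer carry the tag filter, the k nearest documents are chosen before the tag is applied, so fewer than k of them may remain in the filtered hits.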