Elasticsearch Hybrid Query - No Results

I'm trying to run a hybrid search across two fields: a full-text field and a knn_vector (word embeddings) field. Over 10,000 Wikipedia documents are currently indexed on an ES stack, each with both of these fields (see "content" and "embeddings" in the mapping).

Note that the knn_vector field is defined inside a nested object.

This is the current mapping of the items indexed:

    mapping = {
        "settings": {
            "index": {
                "knn": True,
                "knn.space_type": "cosinesimil"
            }
        },
        "mappings": {
            "dynamic": "strict",
            "properties": {
                "elasticId": {"type": "text"},
                "owners": {"type": "text"},
                "type": {"type": "keyword"},
                "accessLink": {"type": "keyword"},
                "content": {"type": "text"},
                "embeddings": {
                    "type": "nested",
                    "properties": {
                        "vector": {
                            "type": "knn_vector",
                            "dimension": VECTOR_DIM
                        }
                    }
                }
            }
        }
    }

My goal is to compare the query scores from both fields to understand whether one is more efficient than the other (full text vs. knn_vectors), and how Elasticsearch decides which documents to return based on the score from each.

I understand I could simply split the queries (two separate queries), but ideally, we might want to use a hybrid search of this type in production.
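For reference, the split-query comparison mentioned above could be sketched roughly as follows. This is only an illustration: `compare_scores` is a hypothetical helper, and it assumes each input is the `hits.hits` list of a standard search response, where every hit carries `_id` and `_score`.

```python
from typing import Any, Dict, List, Optional


def compare_scores(
    text_hits: List[Dict[str, Any]],
    vector_hits: List[Dict[str, Any]],
) -> Dict[str, Dict[str, Optional[float]]]:
    """Line up the scores of two separate searches by document id.

    A score of None means the document was not returned by that search.
    """
    table: Dict[str, Dict[str, Optional[float]]] = {}
    for hit in text_hits:
        row = table.setdefault(hit["_id"], {"text": None, "vector": None})
        row["text"] = hit["_score"]
    for hit in vector_hits:
        row = table.setdefault(hit["_id"], {"text": None, "vector": None})
        row["vector"] = hit["_score"]
    return table
```

Each argument would be taken from a response as `response["hits"]["hits"]`, one from the full-text query and one from the vector query.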

This is the current query that searches on both full text and the knn_vectors:

    def MakeHybridSearch(query):
        query_vector = convert_to_embeddings(query)
        result = elastic.search({
            "explain": True,
            "profile": True,
            "size": 2,
            "query": {
                "function_score": {
                    "functions": [
                        {
                            "filter": {
                                "match": {
                                    "text": {
                                        "query": query,
                                        "boost": "5"
                                    }
                                }
                            },
                            "weight": 2
                        },
                        {
                            "filter": {
                                "script": {
                                    "source": "knn_score",
                                    "params": {
                                        "field": "doc_vector",
                                        "vector": query_vector,
                                        "space_type": "l2"
                                    }
                                }
                            },
                            "weight": 4
                        }
                    ],
                    "max_boost": 5,
                    "score_mode": "replace",
                    "boost_mode": "multiply",
                    "min_score": 5
                }
            }
        }, index="files_en", size=1000)

The current problem is that no query returns any results.
Result:

    {
        "took": 3,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": {
                "value": 0,
                "relation": "eq"
            },
            "max_score": null,
            "hits": []
        }
    }

Even when the query does return a response, every hit has a score of 0.

Is there an error in the query structure? Could this be on the mapping side? If not, is there a better way of doing this?

Thank you for your help!

What is this? Is this your custom plugin?

    "knn": True,
    "knn.space_type": "cosinesimil"

Regardless of this, here is a summary page on how queries can be combined through compound queries.

The function_score query lets you provide a single query (e.g. a match query) and combine its _score with a set of functions. In your case you can have a single script_score function that outputs the score of a document based on the knn calculation. You can also use the _score from the query inside the script of this script_score function, and write a custom script that defines how the two should be combined.
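As a rough sketch of that structure, the request body could be built like this. Several things here are assumptions rather than anything confirmed in this thread: `build_hybrid_query` is a hypothetical helper, `content_vector` is a hypothetical top-level field of Elasticsearch's built-in `dense_vector` type (not the nested `knn_vector` field from the mapping above), and the Painless `cosineSimilarity` call follows the form documented for recent 7.x releases, which may differ in other versions or plugins. The weights are placeholders.

```python
from typing import Any, Dict, List


def build_hybrid_query(
    query_text: str,
    query_vector: List[float],
    text_field: str = "content",
    vector_field: str = "content_vector",  # hypothetical dense_vector field
    text_weight: float = 1.0,
    vector_weight: float = 1.0,
) -> Dict[str, Any]:
    """Build a function_score body: one match query plus one script_score function."""
    # The script combines the match query's _score with a vector similarity.
    # Adding 1.0 keeps the cosine term non-negative, since scores cannot be negative.
    script_source = (
        "params.text_weight * _score + params.vector_weight * "
        "(cosineSimilarity(params.query_vector, '" + vector_field + "') + 1.0)"
    )
    return {
        "query": {
            "function_score": {
                "query": {"match": {text_field: {"query": query_text}}},
                "functions": [
                    {
                        "script_score": {
                            "script": {
                                "source": script_source,
                                "params": {
                                    "query_vector": query_vector,
                                    "text_weight": text_weight,
                                    "vector_weight": vector_weight,
                                },
                            }
                        }
                    }
                ],
                # "replace" makes the script output the final score; the script
                # already folds the query's _score in itself.
                "boost_mode": "replace",
            }
        }
    }
```

The returned dictionary would be passed as the search body. Note that with this layout only documents matching the text query are scored at all, so the vector term re-ranks the text matches rather than retrieving additional documents.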
