How to search when there are 2 fields with dense vectors

I am working on implementing semantic search. I was able to implement it for the case where there is only one text field for which we create a dense vector. Can anyone suggest how to do this when semantic search needs to cover more than one text field?

I created the mapping below and was able to index the documents successfully, but I am stuck on how to write a search query that considers both Text_1 and Text_2 for semantic search.

es_index = {
    "mappings": {
        "properties": {
            "Text_1": {
                "type": "text"
            },
            "Text_2": {
                "type": "text"
            },
            "Text_Vector_1": {
                "type": "dense_vector",
                "dims": 768
            },
            "Text_Vector_2": {
                "type": "dense_vector",
                "dims": 768
            }
        }
    }
}

es.indices.create(index='my_index', body=es_index, ignore=[400])

The query below works fine when I use only one dense vector field. I need guidance on how to implement this when multiple fields need to be vectorised.

GET my_index/_search
{
  "size" : 5,
  "query": {
    "script_score": {
      "query" : {
        "match_all" : {}
      },
      "script": {
        "source": "cosineSimilarity(params.queryVector, doc['Text_Vector_1']) + 1.0",
        "params": {
          "queryVector": [4, 3.4, -0.2]
        }
      }
    }
  }
}

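Painless scripts give you the flexibility to combine the cosine scores from the two fields any way you like. For example, this script sums the two scores; the + 2.0 offset keeps the result non-negative, since each cosine similarity can be as low as -1: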

"source": "cosineSimilarity(params.queryVector, doc['Text_Vector_1']) + cosineSimilarity(params.queryVector, doc['Text_Vector_2']) + 2.0",
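
To show how the summed script fits into a complete request, here is a minimal Python sketch that builds the query body as a plain dict. The function name `two_field_query` and the weights `w1` and `w2` are hypothetical additions for illustration, not part of the original question; the field names are taken from the mapping above. Depending on your Elasticsearch version, the vector functions may expect the field name as a plain string rather than `doc[...]`, so treat the script text as something to adapt.

```python
def two_field_query(query_vector, w1=1.0, w2=1.0, size=5):
    """Build a script_score body that sums weighted cosine scores from two vector fields."""
    # Negative weights could push the score below zero, which script_score rejects.
    if w1 < 0 or w2 < 0:
        raise ValueError("weights must be non-negative")
    # Each cosine score is shifted by 1.0 so that every term stays >= 0.
    source = (
        "params.w1 * (cosineSimilarity(params.queryVector, doc['Text_Vector_1']) + 1.0) + "
        "params.w2 * (cosineSimilarity(params.queryVector, doc['Text_Vector_2']) + 1.0)"
    )
    return {
        "size": size,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    "source": source,
                    "params": {
                        "queryVector": list(query_vector),
                        "w1": w1,
                        "w2": w2,
                    },
                },
            }
        },
    }
```

With equal weights this reproduces the sum above. The resulting dict could then be passed to the client, for example `es.search(index='my_index', body=two_field_query(vec))`, where `vec` is a 768-dimensional query embedding.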

A more important question is what the right way to combine the scores is. This depends on your application. Some people choose to build a single vector for the whole document, covering all of its fields.
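
The thread does not say how such a single document vector would be built. One possible approach, shown here purely as an assumption, is to take a weighted average of the per-field vectors and scale the result to unit length; another is to concatenate the text fields and embed them once. The helper name `combine_vectors` is hypothetical.

```python
import math


def combine_vectors(vectors, weights=None):
    """Weighted average of equal-length vectors, scaled to unit length."""
    if not vectors:
        raise ValueError("need at least one vector")
    dims = len(vectors[0])
    if any(len(v) != dims for v in vectors):
        raise ValueError("all vectors must have the same number of dimensions")
    if weights is None:
        weights = [1.0] * len(vectors)
    if len(weights) != len(vectors):
        raise ValueError("need exactly one weight per vector")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted mean, computed one dimension at a time.
    mean = [
        sum(w * v[i] for w, v in zip(weights, vectors)) / total
        for i in range(dims)
    ]
    # Normalise to unit length; an all-zero mean is returned unchanged.
    norm = math.sqrt(sum(x * x for x in mean))
    return [x / norm for x in mean] if norm > 0 else mean
```

The combined vector would be indexed into one `dense_vector` field, so the single-field query from the question can be reused as is. Whether averaging suits your data is something to check against your own relevance results.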

Thanks, mayya. I went ahead with the second option, i.e. building a single vector for the whole document, covering all of its fields.
