Allow vector functions in script fields context

Denne · September 8, 2020, 11:51am

Hi!

I use script fields for a very specific use case: scoring different attributes of documents with custom Painless scripts. The query context is only used to filter out non-relevant documents (in the example below it is replaced with a match_all query for simplicity).

Here is an example:

GET my_index/_search
{
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "attribute_1": {
      "script": {
        "lang": "painless",
        "source": "cosineSimilarity(params.query_vector, 'dense_vector') + 1.0",
        "params": {
          "query_vector": [ ... ]
        }
      }
    },
    "attribute_2": {
      "script": {
        "lang": "painless",
        "source": "...",
        "params": {
          "some_param":  ... 
        }
      }
    }
  }
}

This example does not work because cosineSimilarity isn't allowed in the script_fields context. The following error is returned:

...
"caused_by" : {
    "type" : "illegal_argument_exception",
    "reason" : "Unknown call [cosineSimilarity] with [2] arguments."
}

This makes it impossible to use vector embeddings (e.g. sentence embeddings) to calculate a similarity for each attribute and return these scores in the results.

Now my question: why isn't this possible? I couldn't find a legitimate reason. Also, is there a workaround to get multiple scores for each hit using vector functions?

mayya · September 18, 2020, 9:04pm

Indeed, vector functions are only available in the ScoreScript context. They are intended for scoring documents, and the use case you presented here is quite original. We have a relevant issue about exposing vector values in scripts; I can add your request there as something we will consider.

is there a workaround to get multiple scores for each hit using vector functions

No. You can either combine the outputs of vector functions on multiple fields into a single Painless score, or issue multiple queries.
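A minimal sketch of the first option, assuming a script_score query and two dense_vector fields. The field names vector_1 and vector_2, the 0.5 weights, and the three-dimensional query vectors are hypothetical and would need to match the actual mapping. Each cosine similarity is shifted by 1.0 because scores must not be negative.

```
# Hypothetical field names, weights, and vector values
GET my_index/_search
{
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "lang": "painless",
        "source": "0.5 * (cosineSimilarity(params.query_vector_1, 'vector_1') + 1.0) + 0.5 * (cosineSimilarity(params.query_vector_2, 'vector_2') + 1.0)",
        "params": {
          "query_vector_1": [0.1, 0.2, 0.3],
          "query_vector_2": [0.4, 0.5, 0.6]
        }
      }
    }
  }
}
```

Each hit then carries one combined _score; the individual similarities are not returned separately.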

Denne · September 20, 2020, 9:10am

Hi and thanks for answering!

Yes, my use case is probably more exotic than most, but I think Elasticsearch is perfect for something like this. Matching/filtering entities and also retrieving some sort of evaluation (in my case, scores between 0 and 1 for different attributes) isn't too far off. Sometimes you need or want additional information about the hits.

We can take Tinder as an example, since they also use Elasticsearch and, like me, use it for matching entities. If you want to show additional information about matches in the UI (e.g. how well a specific attribute matches), script fields are a nice way to achieve this. Since machine learning becomes more common every day, vector functions like cosine similarity get used more and more for this kind of thing. Allowing them in the script fields context makes sense in my opinion. Maybe there is a downside I'm missing, though.

You mentioned multiple queries as an alternative: this can be a solution, but when response times are an important factor (e.g. Tinder), making multiple requests might take too long.
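For reference, the multiple-queries alternative does not necessarily mean multiple HTTP round trips. A rough sketch, assuming the multi search API (_msearch) is acceptable for this use case: one script_score search per attribute is sent in a single request, and the client matches the hits by _id afterwards. The field names vector_1 and vector_2 and the query vectors are hypothetical, and each search is still executed separately on the cluster.

```
# Hypothetical field names and vector values; one search per attribute
GET my_index/_msearch
{ }
{ "_source": false, "query": { "script_score": { "query": { "match_all": {} }, "script": { "source": "cosineSimilarity(params.query_vector, 'vector_1') + 1.0", "params": { "query_vector": [0.1, 0.2, 0.3] } } } } }
{ }
{ "_source": false, "query": { "script_score": { "query": { "match_all": {} }, "script": { "source": "cosineSimilarity(params.query_vector, 'vector_2') + 1.0", "params": { "query_vector": [0.4, 0.5, 0.6] } } } } }
```

The response contains one entry per search in the responses array, in request order, so the _score of the first entry would correspond to attribute 1 and the second to attribute 2. Both searches should use the same filtering query so that they score the same set of documents.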

As for combining the outputs of vector functions into a single score: I don't quite get how this would work. How would I get scores from multiple fields using vector functions and be able to separate them afterwards?

I also read the issue on GitHub. If you'd like, I can take part in the discussion there if that's easier for you guys.

mayya · September 21, 2020, 1:32pm

I don't quite get how this would work. How would I be able to get scores from multiple fields using vector functions and be able to separate them afterwards?

You are right, it is not possible to separate individual scores – a script allows you to combine output from multiple vector functions into a single score.

I also read the issue on GitHub. If you'd like, I can take part in the discussion there if that's easier for you guys.

I have added your request to the GitHub issue. Feel free to participate in it as well.

system · October 19, 2020, 1:32pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
