Elasticsearch: l2norm in script_score returns wrong distances

I'm trying to calculate the L2 norm (Euclidean distance) between two vector fields of the documents in my index. To validate the results, I set up a test index with two documents, each of which holds two equal vectors.
My index looks like this:

Documents
 _index:2vecstest
 _type:_doc
 _id:0
 _score:None
 _source:
         imageid:0
         score:1
         gpd:[0, 0, 1, 1]
         mask:1111
         tempvec:[0, 0, 1, 1]
 sort:[0]

 _index:2vecstest
 _type:_doc
 _id:1
 _score:None
 _source:
         imageid:1
         score:1
         gpd:[0, 0, 0, 1]
         mask:1111
         tempvec:[0, 0, 0, 1]
 sort:[1]

Then I want to calculate the distance between gpd (a dense_vector field) and tempvec (a double field) of each document. For this I wrote the following request:

    # `es` (the Elasticsearch client), `size` and `_INDEX` are defined elsewhere.
    request = {
        "size": size,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    "lang": "painless",
                    "source": """
                        return l2norm(doc['tempvec'], doc['gpd']) + 1;
                    """,
                },
            }
        },
    }

    try:
        res = es.search(index=_INDEX, body=request)
    except elasticsearch.ElasticsearchException as es1:
        ...

I would expect both documents to get the same score, since within each document the two vectors are equal. However, I'm getting the following search result:

took:18
 timed_out:False
 _shards:
     total:1
     successful:1
     skipped:0
     failed:0
 hits:
     total:
         value:2
         relation:eq
     max_score:2.0
     hits

     _index:2vecstest
     _type:_doc
     _id:1
     _score:2.0
     _source:
         imageid:1
         score:1
         gpd:[0, 0, 0, 1]
         mask:1111
         tempvec:[0, 0, 0, 1]

     _index:2vecstest
     _type:_doc
     _id:0
     _score:1.0
     _source:
         imageid:0
         imid:0
         score:1
         gpd:[0, 0, 1, 1]
         mask:1111
         tempvec:[0, 0, 1, 1]

Strangely, the two documents get different scores (2.0 for document 1 and 1.0 for document 0), so document 0 is ranked lower.
What am I doing wrong here? I could not find an answer in the Elasticsearch documentation.

Greetings,
Christian

Hello Christian, hope this reply isn't coming too late. It's not actually supported for l2norm to load two vectors from a document. It's intended for comparing a query to a document vector. We updated the function syntax in 7.6 to make this clear. (I wonder if we accidentally allowed it before 7.6, which resulted in buggy output).
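
To illustrate the supported pattern, here is a minimal sketch that passes the vector as a script parameter and compares it against the gpd field. The helper name `build_l2norm_query`, the parameter name `query_vector`, and the `legacy_syntax` flag are made up for this example. The `1 / (1 + l2norm(...))` form is one common way to make smaller distances rank higher, not something taken from the question. The two script strings reflect the 7.6+ form (field name as a string) and the earlier form (`doc['field']`) as far as I recall them, so please check them against the docs for your version.

```python
def build_l2norm_query(query_vector, field="gpd", size=10, legacy_syntax=False):
    """Build a script_score request that compares a query vector to a dense_vector field.

    Hypothetical helper: names and defaults are assumptions for illustration.
    """
    if legacy_syntax:
        # Assumed pre-7.6 form: the document vector is passed as doc['<field>'].
        source = "1 / (1 + l2norm(params.query_vector, doc['%s']))" % field
    else:
        # Assumed 7.6+ form: the document vector is referenced by field name.
        source = "1 / (1 + l2norm(params.query_vector, '%s'))" % field
    return {
        "size": size,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    "lang": "painless",
                    "source": source,
                    # The query vector travels with the request, not with the document.
                    "params": {"query_vector": list(query_vector)},
                },
            }
        },
    }


request = build_l2norm_query([0, 0, 1, 1])
# res = es.search(index=_INDEX, body=request)  # `es` and `_INDEX` as in the question
```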

I hope that helps. The summary is that comparing two vectors from a single document isn't supported, and to watch out for a syntax change when upgrading to 7.6.
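
If the distance between two fields of the same document is really what is needed, one possible workaround (an assumption on my side, not a built-in feature) is to fetch the documents and compute the distance on the client from `_source`. The sketch below uses plain Python; `l2_distance` and `score_hits` are hypothetical helper names, and the sample hits copy the two documents from the question.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def score_hits(hits, field_a="gpd", field_b="tempvec"):
    """Return (doc id, distance + 1) pairs, mirroring the script in the question."""
    return [
        (hit["_id"], l2_distance(hit["_source"][field_a], hit["_source"][field_b]) + 1)
        for hit in hits
    ]


# Sample data copied from the two test documents in the question.
hits = [
    {"_id": "0", "_source": {"gpd": [0, 0, 1, 1], "tempvec": [0, 0, 1, 1]}},
    {"_id": "1", "_source": {"gpd": [0, 0, 0, 1], "tempvec": [0, 0, 0, 1]}},
]

scores = score_hits(hits)
print(scores)  # [('0', 1.0), ('1', 1.0)]
```

Both documents come out with a distance of 0 and therefore the same score, which is the result the question expected.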

Julie
