I'm trying to calculate the L2 norm between two fields of the documents in my database. To validate the result, I set up a test index with two documents, each of which contains two equal vectors.
My database looks like this:
Documents
_index:2vecstest
_type:_doc
_id:0
_score:None
_source:
imageid:0
score:1
gpd
[0, 0, 1, 1]
mask:1111
tempvec
[0, 0, 1, 1]
sort
[0]
_index:2vecstest
_type:_doc
_id:1
_score:None
_source:
imageid:1
score:1
gpd
[0, 0, 0, 1]
mask:1111
tempvec
[0, 0, 0, 1]
sort
[1]
Then I want to calculate the distance between gpd (a dense_vector field) and tempvec (a double field) within each document. To do this, I wrote the following request:
request = {
    "size": size,
    "query": {
        "script_score": {
            "query": {
                "match_all": {}
            },
            "script": {
                "lang": "painless",
                "source": """
                    return l2norm(doc['tempvec'], doc['gpd']) + 1;
                """,
            }
        }
    }
}
try:
    res = es.search(index=_INDEX, body=request)
except elasticsearch.ElasticsearchException as es1:
    ...
I would expect both documents to receive the same, maximum score, since within each document the two vectors are equal. However, I'm getting the following search result:
took:18
timed_out:False
_shards:
total:1
successful:1
skipped:0
failed:0
hits:
total:
value:2
relation:eq
max_score:2.0
hits
_index:2vecstest
_type:_doc
_id:1
_score:2.0
_source:
imageid:1
score:1
gpd
[0, 0, 0, 1]
mask:1111
tempvec
[0, 0, 0, 1]
_index:2vecstest
_type:_doc
_id:0
_score:1.0
_source:
imageid:0
imid:0
score:1
gpd
[0, 0, 1, 1]
mask:1111
tempvec
[0, 0, 1, 1]
Strangely, the second hit (_id 0) gets a lower score, even though its two vectors are equal as well.
What am I doing wrong here? I couldn't find an answer in the Elasticsearch documentation.
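As a sanity check outside Elasticsearch, here is a minimal sketch of the value I expect for each document. It assumes that l2norm is the plain Euclidean distance between the two vectors; the helper name l2_distance and the docs dictionary are only illustrative and are not part of my actual setup.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# The two test documents from the index above, keyed by _id.
docs = {
    0: {"gpd": [0, 0, 1, 1], "tempvec": [0, 0, 1, 1]},
    1: {"gpd": [0, 0, 0, 1], "tempvec": [0, 0, 0, 1]},
}

# Same formula as in the script: distance + 1.
expected_scores = {
    doc_id: l2_distance(d["tempvec"], d["gpd"]) + 1
    for doc_id, d in docs.items()
}

print(expected_scores)  # {0: 1.0, 1: 1.0}
```

Under that assumption both documents should score 1.0, because the distance between equal vectors is 0. The result above gives 2.0 for _id 1 and 1.0 for _id 0 instead.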
Greetings,
Christian