ScriptEngine - ScoreScript : cosine similarity

Hi,
I'm using an updated version (running on 6.5.4) of:

When running on a single index, all the scores are "correct": between 0 and 1 (double), as cosine similarity should be. But when running over 2+ indexes, either through an alias or by specifying them manually, I get scores above 1. I specified "boost_mode": "replace" to get only the script score.

At first I thought it was because the vectors were not normalized, but when running on each index separately I got correct results.
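For reference, a minimal Python sketch of the arithmetic behind the normalization concern. It is independent of the plugin, and the function names and sample vectors are made up for illustration. A true cosine similarity is always bounded (between -1 and 1 in general), while a raw dot product of non-normalized vectors can be arbitrarily large:

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the two L2 norms."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / (norm_a * norm_b)


# Hypothetical, non-normalized vectors pointing in the same direction.
query = [3.0, 4.0]
doc = [6.0, 8.0]

print(dot(query, doc))                # raw dot product: 50.0
print(cosine_similarity(query, doc))  # bounded score: 1.0
```

A score above 1 therefore means that whatever produced it is not a pure cosine of the two vectors; the sketch does not say where the extra factor comes from.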

Am I missing something?

Query example:
POST 2018*/_search

{
  "min_score": 0.6,
  "size": 1000,
  "query": {
    "function_score": {
      "boost_mode": "replace",
      "functions": [
        {
          "script_score": {
            "script": {
              "source": "staysense",
              "lang": "fast_cosine",
              "params": {
                "field": "embeddedVector",
                "cosine": true,
                "encoded_vector": "v+kopYAAAAA/wivkYAAAAD+wfJeAAAAAv8DL4QAAAAA/waYiwAAAAL+zAmvAAAAAv8c+aiAAAAC/07MyQAAAAL+ccr9AAAAAP9feCOAAAAC/y+ivYAAAAL/R34XgAAAAv+G8nuAAAAA/09hlwAAAAL/MkSWAAAAAP9EXn4AAAAC/zBBxYAAAAD/UY+3AAAAAP7zQSkAAAAC/zRijgAAAAA=="
              }
            }
          }
        }
      ]
    }
  }
}
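To check offline whether a stored or query vector is normalized, something like the following Python sketch could help. It assumes the encoded_vector parameter is a Base64 string of big-endian 64-bit doubles; that layout is an assumption about the plugin, not something confirmed in this thread, and the helper names are hypothetical.

```python
import base64
import math
import struct


def decode_vector(encoded):
    """Decode a Base64 string into a list of floats.

    Assumption: the payload is a sequence of big-endian 64-bit doubles.
    """
    raw = base64.b64decode(encoded)
    if len(raw) % 8 != 0:
        raise ValueError("payload length is not a multiple of 8 bytes")
    count = len(raw) // 8
    return list(struct.unpack(">%dd" % count, raw))


def encode_vector(values):
    """Inverse of decode_vector, under the same layout assumption."""
    packed = struct.pack(">%dd" % len(values), *values)
    return base64.b64encode(packed).decode("ascii")


def l2_norm(values):
    """Euclidean length; close to 1.0 for a normalized vector."""
    return math.sqrt(sum(x * x for x in values))


# Round trip on a made-up unit vector to show the helpers agree.
sample = [0.6, -0.8]
encoded = encode_vector(sample)
decoded = decode_vector(encoded)
print(decoded, l2_norm(decoded))
```

Passing the encoded_vector value from the query to decode_vector and printing l2_norm of the result shows whether the query vector has unit length, provided the layout assumption holds.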

You may want to ask the authors or file an issue on that plugin's GitHub page. I'm not sure that you'll find many folks here who are aware of it.

Also, just as an FYI, in the future Elasticsearch will be adding a vector field for this type of use case, which you can read about here
