Hello everyone! I'm currently using Elasticsearch to produce some features that serve as input for my machine learning model. One of the features is the maximum cosineSimilarity between a user's rating embeddings and a document's titleEmbedding field, multiplied by the score associated with the best-matching rating. My current query:
{
  "query": {
    "script_score": {
      "script": {
        "source": "double maxVal = 0.0; int index = 0; for (int i = 0; i < params.ratings.length; i++) { double sim = cosineSimilarity(params.ratings[i]['embedding'], 'titleEmbedding') + 1; if (sim > maxVal) { maxVal = sim; index = i; } } return maxVal * params.ratings[index]['score'];",
        "params": {
          "ratings": [
            {
              "embedding": [embedding A],
              "score": 25
            }...
          ]
        }
      }
    }
  }
}
As you can see, the idea is to capture maxVal, the maximum cosine similarity across all ratings, and multiply it by the score of the rating that produced it. That should be all. But for some reason the script only computes the cosineSimilarity against the first embedding (position 0 in ratings), and then multiplies that value by the score of the nth rating. That is not the behaviour I expect. Does anyone know what could be happening?
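To make the expected behaviour concrete, here is a minimal Python sketch of what I want the script to compute. It is only a reference for the intended logic, not what Elasticsearch does internally: the helper names (cosine_similarity, expected_score) and the two-dimensional sample vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def expected_score(ratings, title_embedding):
    """Mirror the loop in the script: find the rating whose embedding is
    most similar to the title embedding, then scale by that rating's score."""
    max_val = 0.0
    index = 0
    for i, rating in enumerate(ratings):
        # +1 shifts the similarity into [0, 2], as in the script
        sim = cosine_similarity(rating["embedding"], title_embedding) + 1
        if sim > max_val:
            max_val = sim
            index = i
    return max_val * ratings[index]["score"]


# Hypothetical sample data: the second rating matches the title exactly.
ratings = [
    {"embedding": [1.0, 0.0], "score": 25},
    {"embedding": [0.0, 1.0], "score": 10},
]
title_embedding = [0.0, 1.0]

print(expected_score(ratings, title_embedding))  # 20.0
```

With this sample data the second rating wins (similarity 1 + 1 = 2), so the result is 2 * 10 = 20.0. That is the kind of value I expect from the query, rather than a score built from the similarity of the first embedding only.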