How can I compute a dot product in Elasticsearch?
Let's say my indexed documents are in the form:
{
"name": "...",
"features": [
{"f_i": 0.xxx},
...
{"f_j": 0.xxx}
]
}
For example, one document could be:
{
"name": "img1",
"features": [
{"f1": 0.6},
{"f3": 0.4}
]
}
Another document could be:
{
"name": "img2",
"features": [
{"f2": 0.1},
{"f3": 0.5},
{"f5": 0.4}
]
}
Basically, the features field of each document is a sparse vector in which only the elements with values > 0 are stored. Assume the full length of the sparse vector is known, let's say 100.
Now I want to query with a new document and find the indexed document with the most similar features, where similarity is measured by the dot product of the two vectors. Say my query document is:
{
"name": "img",
"features": [
{"f3": 0.2},
{"f4": 0.4},
{"f5": 0.1},
{"f6": 0.3}
]
}
Then the dot product with the 1st example above only involves f3, so it is 0.4 x 0.2 = 0.08, while the dot product with the 2nd example involves f3 and f5, so it is 0.5 x 0.2 + 0.4 x 0.1 = 0.14. Thus the query should give a higher score to the 2nd document.
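To make the scoring I am after concrete, here is a minimal Python sketch of the same computation done outside Elasticsearch. The helper names `to_sparse` and `sparse_dot` are hypothetical and only for illustration; they are not part of any Elasticsearch API. What I am looking for is the equivalent of this ranking at query time.

```python
def to_sparse(features):
    """Flatten [{"f3": 0.4}, {"f5": 0.1}] into {"f3": 0.4, "f5": 0.1}."""
    vec = {}
    for entry in features:
        vec.update(entry)
    return vec


def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {feature_name: value}."""
    # Iterate over the smaller vector; only shared features contribute.
    if len(a) > len(b):
        a, b = b, a
    return sum(value * b[name] for name, value in a.items() if name in b)


# The two indexed documents and the query document from the examples above.
docs = [
    {"name": "img1", "features": [{"f1": 0.6}, {"f3": 0.4}]},
    {"name": "img2", "features": [{"f2": 0.1}, {"f3": 0.5}, {"f5": 0.4}]},
]
query = {
    "name": "img",
    "features": [{"f3": 0.2}, {"f4": 0.4}, {"f5": 0.1}, {"f6": 0.3}],
}

q = to_sparse(query["features"])
scores = {d["name"]: sparse_dot(to_sparse(d["features"]), q) for d in docs}

# Highest dot product first.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # ['img2', 'img1']
```

Up to floating-point rounding, the scores are 0.08 for img1 and 0.14 for img2, matching the hand calculation.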
How do I do that? Thanks
P/S: I had another topic asking about searching for nearest neighbors in ES. I have since figured out that that process should be done outside of ES, so the remaining problem can now be stated more clearly, as above.