Hi,
I am trying to calculate the k nearest neighbours of a vector at query time.
I have structured my documents as follows:
{
"vector": [
0,
0,
0,
0.3004656841866144,
0.028906494022773094,
0,
0.001937984496124031,
0.051182667461737226,
0,
0.001937984496124031,
0.003875968992248062,
0,
0,
0.007751937984496124,
0.001937984496124031,
0.5068361870687452,
0.06256921373200443,
0,
0.032597893063009344,
0
],
"id": "79583a25d73cb2e82acaf29cdeb28b85"
}
And I have an input vector like this:
[
0,
0.0014124293785310737,
0,
0.41034832330964105,
0,
0,
0,
0.004237288135593221,
0,
0.0014124293785310737,
0,
0,
0,
0.2172134135228723,
0.04737903225806452,
0,
0.008474576271186442,
0,
0.29198104610898495,
0.017541461636595593
]
My requirement is to get the nearest neighbours of the input vector based on cosine similarity. I am working with Elasticsearch 6.0.0, and the index holds about a million vector records from which I have to pick the nearest neighbours.
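To make the requirement concrete, here is a rough brute-force sketch in plain Python of the result I am after. It runs outside Elasticsearch, and the function names (`cosine_similarity`, `k_nearest`) and the toy documents are only illustrative assumptions, not an Elasticsearch API. What I am looking for is the equivalent ranking done by Elasticsearch itself at query time.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def k_nearest(query, docs, k):
    """Return the k (score, id) pairs with the highest cosine similarity to query."""
    scored = ((cosine_similarity(query, d["vector"]), d["id"]) for d in docs)
    return heapq.nlargest(k, scored)


# Toy documents in the same shape as mine (ids and values are made up).
docs = [
    {"id": "a", "vector": [1.0, 0.0, 0.0]},
    {"id": "b", "vector": [0.0, 1.0, 0.0]},
    {"id": "c", "vector": [0.9, 0.1, 0.0]},
]
top = k_nearest([1.0, 0.0, 0.0], docs, k=2)
print([doc_id for _, doc_id in top])  # ['a', 'c']
```

Scanning all of the roughly one million records in application code for every query is what I would like to avoid.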
Any help is appreciated.
Thanks in advance!