Elasticsearch - How to find similarity between 2 arrays?

I have an array of integers stored in ES, say [1,2,3,4,5].
Now my query array is [1,2,3,7,8].
I want to get a score for how similar the two arrays are.
For example,
Array in ES - [1,2,3,4,5]
Query 1 - [1,2,3,7,8]
Query 2 - [9,76,23,56,1,2]

Query 1 should have a higher score, as it has more elements in common with [1,2,3,4,5] than Query 2 does.

Thanks for the help in advance!

Hi,

I would run a bool query with each number in its own should term clause. I think that, naturally,
the document with the most terms in common will get a better score.

With your Query 1:
Bool query / minimum should match = 1
[
--> should - term field (1),
--> should - term field (2),
--> should - term field (3),
--> should - term field (7),
--> should - term field (8)
]
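
For reference, the outline above could be written as a query body roughly like this. It is only a sketch: the field name `numbers` and the helper `build_overlap_query` are made up for illustration, and dropping duplicate values from the query array is my own assumption, not something the outline requires.

```python
import json


def build_overlap_query(field, values):
    """Build a bool query with one `should` term clause per distinct value.

    `field` is a hypothetical field name; replace it with the real one.
    """
    distinct = sorted(set(values))  # assumption: repeated query values are dropped
    return {
        "query": {
            "bool": {
                "should": [{"term": {field: v}} for v in distinct],
                "minimum_should_match": 1,
            }
        }
    }


# Query 1 from the question
query = build_overlap_query("numbers", [1, 2, 3, 7, 8])
print(json.dumps(query, indent=2))
```

The printed JSON can be sent as the body of a search request. The scores come from Elasticsearch's own relevance scoring, so they may not equal the plain count of matching numbers.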

bye,
Xavier

Hey, thanks for the answer! I did try this. However, the problem is that if we had
[1,2,3,3,3], it would get a higher score because of the repeated 3s than [1,2,3,4,90], which is not right, because [1,2,3,4,90] is closer to [1,2,3,4,5] than [1,2,3,3,3] is.

OK, maybe it will be faster/easier to compute the score in your code than to try to do it in ES :wink:
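
One possible way to do that in application code is a set-based overlap score. The sketch below uses Jaccard similarity, which is my choice of measure and not something settled in this thread; the function name `jaccard` is just illustrative. Because it works on sets, repeated values such as the extra 3s do not inflate the score.

```python
def jaccard(a, b):
    """Jaccard similarity of two integer arrays, treated as sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0  # convention chosen here: two empty arrays count as identical
    return len(sa & sb) / len(sa | sb)


stored = [1, 2, 3, 4, 5]
candidates = [
    [1, 2, 3, 7, 8],
    [9, 76, 23, 56, 1, 2],
    [1, 2, 3, 3, 3],
    [1, 2, 3, 4, 90],
]

# Rank candidates from most to least similar to the stored array
for candidate in sorted(candidates, key=lambda c: jaccard(stored, c), reverse=True):
    print(candidate, round(jaccard(stored, candidate), 3))
```

With this measure, [1,2,3,4,90] ranks above [1,2,3,3,3], and Query 1 ranks above Query 2.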

Right! Was just trying to see if it was possible! Thanks! :slight_smile:
