Hello, I am doing reverse image search using Elasticsearch. I have hashes stored in an index, and I am now trying to find similar hashes (to compensate for compression and the like) using Query String fuzziness.
My code for the search is:
var searchResponse = await Program._elasticclient.SearchAsync<IndexedImage>(s => s
    .Index("images")
    .Query(q => q.QueryString(qs => qs
        .FuzzyMaxExpansions(150)
        .Fuzziness(Fuzziness.EditDistance(150))
        .Fields(f => f.Field(ff => ff.imagehash))
        .Query(imagehash)))
    .Size(10000));
The imagehash field is a string holding the hash (a number roughly 600 characters long).
Searching for similar strings works fine in some cases but not in others, and I am wondering what is wrong.
For example, this is the original hash; searching for it returns 2 results from the DB:
This is the hash of a similar image (the Levenshtein distance is 52); searching for it returns 0 results from the DB:
When I took the original hash and replaced a bunch of digits with the same number of 9s, I got a Levenshtein distance of 59, and the search returned the same 2 results as for the original. Hash:
All the hashes I have stored and the ones I am searching for are always the same length.
Similarity can be compared properly here: https://countwordsfree.com/comparetexts
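For anyone who would rather check the distances locally than on that site, here is a minimal sketch of a Levenshtein distance function in C#. The class and method names (HashDistance, Levenshtein) are my own, chosen for illustration; they are not part of NEST or Elasticsearch, and this is not necessarily how the site above computes its score.

```csharp
using System;

// Hypothetical helper for illustration only; not part of NEST or Elasticsearch.
public static class HashDistance
{
    // Two-row dynamic-programming Levenshtein distance:
    // the minimum number of single-character insertions, deletions,
    // and substitutions needed to turn a into b.
    public static int Levenshtein(string a, string b)
    {
        if (a is null) throw new ArgumentNullException(nameof(a));
        if (b is null) throw new ArgumentNullException(nameof(b));

        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];

        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            // Distance from a[0..i) to the empty prefix of b.
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int substitutionCost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1),
                    previous[j - 1] + substitutionCost);
            }
            // The row just filled becomes the "previous" row for the next i.
            (previous, current) = (current, previous);
        }

        return previous[b.Length];
    }
}
```

Calling HashDistance.Levenshtein(storedHash, queryHash), where storedHash and queryHash are placeholder variables holding two of the hash strings, gives the edit distance between them. With roughly 600-character hashes the nested loop is only a few hundred thousand steps, so it is cheap enough for spot checks.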
Can anyone point me in the right direction and tell me what I am doing wrong here? Thanks in advance.