I want to search binary strings in Elasticsearch by Hamming distance, and I want to control the maximum distance.
In my case, the binary strings have a length of 256 and I want everything with a Hamming distance of 32 or below.
Is this possible using fuzzy searching? Looking at the docs, the fuzziness parameter for fuzzy queries only accepts the values 0, 1, 2 or AUTO.
Apparently with AUTO I can set low and high distance values, so perhaps AUTO:0,32 would work?
Can someone confirm this? Is anyone else doing this? If so, how?
I believe the reason for capping this is performance. I do not have any suggested way to do what you want so will leave that for others.
You may be able to do it using a script query or a runtime field, but I am not sure. This would have an impact on performance though, especially at scale.
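To make the script query idea a bit more concrete, here is a rough, untested sketch in Python. It assumes the 256-character binary strings are stored in a keyword field, which I have called `bits` (a hypothetical name), and it builds a request body for a script query whose Painless source counts differing characters. The helper names `hamming_distance`, `within_distance` and `build_script_query` are made up for illustration. The pure-Python functions show the same comparison done client side, which can be handy for checking results on a small sample. Since the script runs against every candidate document, expect it to be slow on large indices.

```python
# Painless source for a script query. Assumption: the document stores the
# binary string in a keyword field named "bits" (hypothetical field name).
PAINLESS_SOURCE = """
String a = doc['bits'].value;
String b = params.query;
if (a.length() != b.length()) { return false; }
int d = 0;
for (int i = 0; i < a.length(); i++) {
  if (a.charAt(i) != b.charAt(i)) { d++; }
}
return d <= params.max_distance;
"""


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length binary strings differ."""
    if len(a) != len(b):
        raise ValueError("binary strings must have the same length")
    # XOR the integer values and count the set bits.
    return bin(int(a, 2) ^ int(b, 2)).count("1")


def within_distance(query: str, candidates: list, max_distance: int = 32) -> list:
    """Client-side filter: keep candidates within max_distance of query."""
    return [c for c in candidates if hamming_distance(query, c) <= max_distance]


def build_script_query(query: str, max_distance: int = 32) -> dict:
    """Build a search request body that filters with the Painless script above."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "script": {
                        "script": {
                            "source": PAINLESS_SOURCE,
                            "params": {
                                "query": query,
                                "max_distance": max_distance,
                            },
                        }
                    }
                }
            }
        }
    }
```

The dictionary returned by `build_script_query` would be sent as the body of a search request against your index. Combining it with other, cheaper filters in the same `bool` query should reduce the number of documents the script has to examine.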