Hamming Distance on Binary Strings


(anahap) #1

Hi there,

Does anyone know the best way to store fixed-length binary data and query it
while scoring by Hamming distance?
A Hamming distance filter with a threshold would also be OK.

Thanks a lot. This would be useful for all kinds of similarity searches based
on fingerprinting algorithms.
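
To make the requirement concrete, here is a minimal sketch in plain Python of the scoring and threshold filter being asked for, independent of any search engine. The function names `hamming_distance` and `filter_by_hamming` and the sample 8-bit fingerprints are illustrative assumptions, and the linear scan is only meant to pin down the semantics, not to suggest how an index should do it.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fixed-length fingerprints differ."""
    # XOR leaves a 1 bit wherever the inputs disagree.
    return bin(a ^ b).count("1")


def filter_by_hamming(query, fingerprints, threshold):
    """Return (distance, fingerprint) pairs within the threshold, nearest first."""
    hits = []
    for fp in fingerprints:
        d = hamming_distance(query, fp)
        if d <= threshold:
            hits.append((d, fp))
    hits.sort()
    return hits


# Hypothetical 8-bit fingerprints, stored as integers.
stored = [0b10110010, 0b10110011, 0b01001101, 0b10100010]
query = 0b10110010

result = filter_by_hamming(query, stored, threshold=1)
```

Storing each fingerprint as an integer keeps the comparison down to one XOR and a bit count, which is why fixed length matters here: every stored value is compared position by position against the query.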


(Shay Banon) #2

You can use fuzzy queries for Levenshtein distance, but note that they are
slow(er) in Lucene 3.3 and will be much faster in Lucene 4.0 (when it comes
out).
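
Keep in mind that Levenshtein (edit) distance is not the same measure as Hamming distance: it also allows insertions and deletions, so for two strings of equal length it can be smaller than the Hamming distance, but never larger. The sketch below illustrates the gap using a textbook dynamic-programming Levenshtein written for this example (it is not Lucene's implementation); the function names and sample strings are assumptions.

```python
def hamming(a: str, b: str) -> int:
    """Hamming distance: number of positions at which the strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Edit distance with insert, delete, and substitute, each costing 1."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


a, b = "01010101", "10101010"
h = hamming(a, b)      # every position differs
l = levenshtein(a, b)  # one delete at the front plus one insert at the end
print(h, l)
```

For fingerprints this means a fuzzy match with a small edit distance can accept strings that a Hamming filter with the same threshold would reject.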

In reply to anahap's message of Fri, Aug 26, 2011, 1:50 PM.


(Catalin Banu) #3

Hi,

Did you find a solution?



