I would like to report a conceptual inconsistency in the following article:
Issue summary
In the section describing binary quantization and similarity scoring, the article states:
“Simple binary quantization will transform D into 10101101 and Q into 11111011. For hamming distance, we need direct bit math—it's extremely fast. In this case, the hamming distance is 01010110, which is 86. So, scoring then becomes the inverse of that hamming distance.”
This description conflates Hamming distance with the XOR bitmask interpreted as an integer, which are not the same thing.
Technical clarification
Given:
- D = 10101101
- Q = 11111011
The XOR result is:
D XOR Q = 01010110
However:
- The true Hamming distance is the number of differing bits, which here is 4
- The value 86 is the integer interpretation of the XOR bitmask, not the Hamming distance
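Using 1 / 86 ≈ 0.012 therefore inverts the integer value of the XOR mask, not the Hamming distance. If the score is meant to be the reciprocal of the Hamming distance, it would be 1 / 4 = 0.25 for this example.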
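For reference, here is a minimal Python sketch of the distinction, using the D and Q values quoted from the article. The function name `hamming_distance` is my own and is not taken from the article or from any library; the last two lines use a made-up pair of bytes only to show why the integer value of the mask is not a usable distance.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which a and b differ (popcount of a XOR b)."""
    return bin(a ^ b).count("1")


d = 0b10101101  # D from the article
q = 0b11111011  # Q from the article

xor_mask = d ^ q
print(format(xor_mask, "08b"))  # 01010110 (the XOR bitmask)
print(xor_mask)                 # 86 (the mask read as an integer)
print(hamming_distance(d, q))   # 4 (the actual Hamming distance)

# Hypothetical pair that differs only in the most significant bit:
# the mask reads as 128, yet exactly one bit differs.
print(0b10000000 ^ 0b00000000)                     # 128
print(hamming_distance(0b10000000, 0b00000000))    # 1
```

The XOR step is still the right first operation; the missing step is counting the set bits of the result rather than reading it as a number.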