Hi there
To my knowledge, unlike Lucene, Elasticsearch supports multi-valued byte arrays.
It embeds them in a single value by prefixing that value with the number of byte arrays embedded
and the number of bytes remaining.
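To make the layout concrete, here is a minimal sketch of how such a value could be decoded. It assumes both prefixes are written as variable-length ints (7 data bits per byte, high bit set when more bytes follow) and that each embedded array carries its own length prefix; the class and method names are hypothetical and not part of Elasticsearch or Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not an Elasticsearch or Lucene class.
final class EmbeddedByteArrays {

    // Reads a variable-length int: 7 data bits per byte, high bit means "more bytes follow".
    // pos[0] is the read cursor and is advanced past the bytes consumed.
    static int readVInt(byte[] buf, int[] pos) {
        int value = 0;
        int shift = 0;
        byte b;
        do {
            b = buf[pos[0]++];
            value |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return value;
    }

    // Decodes [count][len][payload][len][payload]... from buf[offset, offset + length).
    static List<byte[]> decode(byte[] buf, int offset, int length) {
        int[] pos = {offset};
        int end = offset + length;
        int count = readVInt(buf, pos);
        List<byte[]> arrays = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int len = readVInt(buf, pos);
            if (pos[0] + len > end) {
                throw new IllegalArgumentException("Embedded array runs past the end of the value");
            }
            arrays.add(Arrays.copyOfRange(buf, pos[0], pos[0] + len));
            pos[0] += len;
        }
        return arrays;
    }

    public static void main(String[] args) {
        // Same shape as the value shown in this post: 01 40 fe ff ff ... (66 bytes in total).
        byte[] value = new byte[66];
        value[0] = 0x01;                          // one embedded array
        value[1] = 0x40;                          // 64 payload bytes follow
        value[2] = (byte) 0xFE;
        Arrays.fill(value, 3, 66, (byte) 0xFF);

        List<byte[]> arrays = decode(value, 0, value.length);
        System.out.println(arrays.size() + " array(s), first has " + arrays.get(0).length + " bytes");
        // prints "1 array(s), first has 64 bytes"
    }
}
```

Under these assumptions, a value holding one 64-byte vector takes 1 + 1 + 64 = 66 bytes.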
There is a problem in my code. I am iterating through the documents to calculate the "vector distance" between them. Here is the code sample where all hell breaks loose.
@Override
public double score(int docId, float subQueryScore) throws IOException {
    if (!values.advanceExact(docId)) {
        throw new ElasticsearchException(
            "Missing BinaryDocValue for field [" + binaryFeatureVector.field() + "]");
    }
    BytesRef x = values.binaryValue();
    BytesRef other = decodeEmbeddedByteArray(x, docId); // HERE IS A MISTAKE!
    // after that it proceeds to calculate the distance between the vectors...
}
values.binaryValue()
returns something like this -> [1 40 fe ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff]
But when I print x.bytes
, I get this ...> 0140FEFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0140FCFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF01...
Basically, I get the binary values of many documents at once. I don't understand what is going on.
EDIT: Also, when I use System.out.println("x.length = " + x.length);
I get 66.
How is this possible?
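One possible explanation, offered as an assumption rather than a confirmed diagnosis: Lucene's BytesRef exposes public bytes, offset and length fields, and only the range from offset to offset + length belongs to the current value. The bytes array itself may be a larger buffer that also holds other data, which would fit a length of 66 alongside a much longer x.bytes. The sketch below uses a hypothetical stand-in class, Ref, so that it runs without Lucene on the classpath.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of Lucene or Elasticsearch.
final class SliceDemo {

    // Stand-in that mirrors the three public fields of Lucene's BytesRef.
    static final class Ref {
        final byte[] bytes;
        final int offset;
        final int length;

        Ref(byte[] bytes, int offset, int length) {
            this.bytes = bytes;
            this.offset = offset;
            this.length = length;
        }
    }

    // Copies only the bytes that belong to this value.
    static byte[] validBytes(Ref ref) {
        return Arrays.copyOfRange(ref.bytes, ref.offset, ref.offset + ref.length);
    }

    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // One shared buffer holding two 4-byte values back to back.
        byte[] shared = {0x01, 0x02, (byte) 0xFE, (byte) 0xFF,
                         0x01, 0x02, (byte) 0xFC, (byte) 0xFF};
        Ref second = new Ref(shared, 4, 4);   // view of the second value only

        System.out.println(toHex(second.bytes));       // prints "0102FEFF0102FCFF"
        System.out.println(toHex(validBytes(second))); // prints "0102FCFF"
    }
}
```

If this is what is happening, any decoding of x would need to start at x.offset and stop after x.length bytes instead of walking the whole x.bytes array.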