I have a problem where Elasticsearch doesn't return k matches for a kNN search. It used to work, so I think something changed between versions 8.8.3 and 8.10.3. Perhaps a minimum score was introduced? I could not find anything about it in the docs, though.
Here is the case:
I have an index where the textEmbedding field has the following mapping:
"textEmbedding": {
"type": "dense_vector",
"dims": 1536,
"index": true,
"similarity": "cosine"
}
The index contains two documents.
Document 1:
{
"_index": "test_index_c64dcf58",
"_id": "author1",
"_source": {
"id": "author1",
// .....
"textEmbedding": [
-0.888888,
-0.888888,
// and so on... (they all are the same NEGATIVE number: -0.888888)
]
}
}
Document 2:
{
"_index": "test_index_c64dcf58",
"_id": "author2",
"_source": {
"id": "author2",
// .....
"textEmbedding": [
0.888888,
0.888888,
// and so on... (they all are the same POSITIVE number: 0.888888)
]
}
}
For the following search query, I would expect to get both documents back. Since the index contains only two documents, both should be among the k=10 best matches. However, the search returns only one of them, namely the one with the negative vector values. The match it does return makes sense, because it is the closest to the query vector.
GET /test_index_c64dcf58/_search
{
"size": 10,
"from": 0,
"knn": {
"field": "textEmbedding",
"k": 10,
"num_candidates": 100,
"query_vector": [
-0.777777,
-0.777777,
// ...and so on...
],
"filter": {
"bool": {
"must": [],
"must_not": []
}
}
},
"aggs": {}
}
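To make the geometry of this test case concrete, here is a minimal Python sketch (plain Python, not Elasticsearch code). It assumes the scoring documented for `cosine` similarity on `dense_vector` fields, `_score = (1 + cosine(query, vector)) / 2`, and it assumes that every elided component of the query vector is also -0.777777. The helper names `cosine` and `es_cosine_score` are made up for illustration.

```python
import math


def cosine(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def es_cosine_score(a, b):
    """Documented _score for similarity "cosine": (1 + cosine) / 2."""
    return (1.0 + cosine(a, b)) / 2.0


DIMS = 1536  # matches "dims" in the mapping

# Assumption: all components of each vector are identical.
query = [-0.777777] * DIMS
author1 = [-0.888888] * DIMS  # same direction as the query
author2 = [0.888888] * DIMS   # exactly opposite direction

score_author1 = es_cosine_score(query, author1)
score_author2 = es_cosine_score(query, author2)

print("author1 score:", score_author1)
print("author2 score:", score_author2)
```

Under these assumptions, author1 scores approximately 1 (the maximum) and author2 scores approximately 0 (the minimum), because its vector points in exactly the opposite direction from the query. I don't know whether a score of 0 is related to the missing hit, but it is the one thing that sets author2 apart in this setup.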