JornWildt
(Jørn Wildt)
October 11, 2024, 8:32am
1
Doing a KNN search on an empty index fails with "All shards failed".
A normal search with a GET like this succeeds with zero hits:
https://host/my-index/_search
But if I do a POST like this, ES fails with status code 400 "All shards failed":
{
  "fields": [
    {
      "field": "text"
    }
  ],
  "knn": {
    "field": "vectorIndex",
    "k": 13,
    "num_candidates": 26,
    "query_vector": [1,2,3,4,5,6,7,...],
    "similarity": -0.24000001
  },
  "size": 13
}
I would expect the KNN search to succeed with status code 200 and zero hits.
As it is now, because of the status code 400, the application using ES shows an error to the end user when it should show "No results".
Is there any way to avoid this? Is it a bug or a feature?
dadoonet
(David Pilato)
October 11, 2024, 8:48am
2
What if you run the knn part as a query instead?
GET /_search
{
  "query": {
    "knn": {
      // ...
    }
  }
}
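As a rough illustration of the suggestion above, the following Python sketch moves a top-level `knn` section into a `query` clause. The helper name `knn_as_query` and the three-element `query_vector` are hypothetical placeholders. Whether the `knn` query accepts every option of the top-level section (for example `k`) depends on the Elasticsearch version, so treat that as an assumption to check against your cluster.

```python
import copy


def knn_as_query(body: dict) -> dict:
    """Return a copy of a search body with top-level `knn` moved under `query`.

    Hypothetical helper for illustration only; it does not validate options.
    """
    new_body = copy.deepcopy(body)
    knn = new_body.pop("knn", None)
    if knn is None:
        return new_body  # nothing to move
    if "query" in new_body:
        # Merging with an existing query is a design decision left to the caller.
        raise ValueError("body already has a query; combine them manually")
    new_body["query"] = {"knn": knn}
    return new_body


# Placeholder request; the real query_vector is elided in the thread.
original = {
    "fields": [{"field": "text"}],
    "knn": {
        "field": "vectorIndex",
        "k": 13,
        "num_candidates": 26,
        "query_vector": [1.0, 2.0, 3.0],
    },
    "size": 13,
}
rewritten = knn_as_query(original)
```

The original body is left untouched, so both forms can be sent and compared.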
JornWildt
(Jørn Wildt)
October 11, 2024, 12:01pm
3
Same result with GET:
GET https://HOST/vector-index/_search HTTP/1.1
Host: HOST
Accept: application/vnd.elasticsearch+json;compatible-with=8
User-Agent: elasticsearch-net/8.15.2+974651520349bd64e6c4755da6f8eed1a4bd6b52 (Microsoft Windows 10.0.22631; .NET 8.0.10; Elastic.Clients.Elasticsearch)
x-elastic-client-meta: es=8.15.2,a=1,net=8.0.10,so=8.0.10,t=0.4.22+b65498cbdc63d96c1653577b4706b74bf3c20179
Authorization: ApiKey XXX
traceparent: 00-42a5813a14052c77129faf3873ca446e-bd585f06235aad38-00
Content-Type: application/vnd.elasticsearch+json;compatible-with=8
Content-Length: 9656
{"fields":[{"field":"text"}],"knn":{"field":"vectorIndex","k":19,"num_candidates":38,"query_vector":[...],"similarity":-0.25},"size":19}
Output:
HTTP/1.1 400 Bad Request
Server: openresty
Date: Fri, 11 Oct 2024 11:58:17 GMT
Content-Type: application/vnd.elasticsearch+json;compatible-with=8
Content-Length: 857
Connection: keep-alive
X-elastic-product: Elasticsearch
{"error":{"root_cause":[{"type":"query_shard_exception","reason":"failed to create query: Cannot invoke \"java.lang.Integer.intValue()\" because \"this.dims\" is null","index_uuid":"KgM1GR66RN6xl7zInIBj4w","index":"f2dev12-expert-environment-chunk-index"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"dfs","grouped":true,"failed_shards":[{"shard":0,"index":"vector-index","node":"85NOPUP4QsWu31m9ieaOng","reason":{"type":"query_shard_exception","reason":"failed to create query: Cannot invoke \"java.lang.Integer.intValue()\" because \"this.dims\" is null","index_uuid":"KgM1GR66RN6xl7zInIBj4w","index":"vector-index","caused_by":{"type":"null_pointer_exception","reason":"Cannot invoke \"java.lang.Integer.intValue()\" because \"this.dims\" is null"}}}]},"status":400}
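Until the cluster runs a fixed version, one possible client-side stopgap is to recognize this specific failure and treat it as "no results". The sketch below is an assumption-laden illustration, not an official client feature: the function name `is_empty_index_knn_failure` is hypothetical, and matching on an error message string is brittle because the wording may differ between versions.

```python
def is_empty_index_knn_failure(status: int, body: dict) -> bool:
    """Return True if a response looks like the `this.dims is null` failure above.

    Hypothetical helper; it only inspects the fields seen in the error body
    quoted in this thread.
    """
    if status != 400:
        return False
    error = body.get("error")
    if not isinstance(error, dict):
        return False
    if error.get("type") != "search_phase_execution_exception":
        return False
    for shard in error.get("failed_shards", []):
        caused_by = shard.get("reason", {}).get("caused_by", {})
        if (
            caused_by.get("type") == "null_pointer_exception"
            and '"this.dims" is null' in caused_by.get("reason", "")
        ):
            return True
    return False


# Trimmed copy of the error body shown above.
npe_reason = (
    'Cannot invoke "java.lang.Integer.intValue()" because "this.dims" is null'
)
sample_error = {
    "error": {
        "type": "search_phase_execution_exception",
        "reason": "all shards failed",
        "failed_shards": [
            {
                "shard": 0,
                "reason": {
                    "type": "query_shard_exception",
                    "caused_by": {
                        "type": "null_pointer_exception",
                        "reason": npe_reason,
                    },
                },
            }
        ],
    },
    "status": 400,
}
treat_as_no_results = is_empty_index_knn_failure(400, sample_error)
```

Any other 400 response still surfaces as an error, so genuine request mistakes are not hidden.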
JornWildt
(Jørn Wildt)
October 11, 2024, 12:02pm
4
But "GET"? Really? You are not supposed to pass any HTTP body to a GET request.
BenTrent
(Ben Trent)
October 11, 2024, 12:13pm
5
What version are you using, @JornWildt? The stack trace indicates this is a bug for sure.
EDIT:
Found the issue & the PR that fixes it: vector search error when the target index is empty · Issue #111733 · elastic/elasticsearch · GitHub
elastic:main
← kderusso:kderusso/fix-dims-null-error
opened 03:08PM - 09 Aug 24 UTC
Resolves https://github.com/elastic/elasticsearch/issues/111733
Script to reproduce (slightly expanded from bug report issue):
```
DELETE esre-team-test-chunks

PUT esre-team-test-chunks
{
  "mappings": {
    "dynamic_templates": [],
    "properties": {
      "metadata": {
        "properties": {
          "esre": {
            "properties": {
              "created_at": {
                "type": "date"
              },
              "embedding_model": {
                "type": "text"
              },
              "file_id": {
                "type": "text"
              },
              "team_id": {
                "type": "text"
              },
              "updated_at": {
                "type": "date"
              }
            }
          },
          "source": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      },
      "text": {
        "type": "text",
        "fields": {
          "ngram": {
            "type": "text"
          }
        }
      },
      "text_embedding": {
        "type": "dense_vector",
        "index": true,
        "similarity": "cosine"
      }
    }
  }
}

PUT _inference/text_embedding/my-e5-model
{
  "service": "elasticsearch",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1,
    "model_id": ".multilingual-e5-small"
  }
}

GET /esre-team-test-chunks/_search
{
  "fields": [
    "text"
  ],
  "knn": {
    "field": "text_embedding",
    "k": 10,
    "num_candidates": 100,
    "query_vector_builder": {
      "text_embedding": {
        "model_id": ".multilingual-e5-small",
        "model_text": "test"
      }
    }
  }
}

GET /esre-team-test-chunks/_search
{
  "fields": [
    "text"
  ],
  "knn": {
    "field": "text_embedding",
    "k": 10,
    "num_candidates": 100,
    "query_vector": [ 1.0, 2.0, 3.0 ]
  }
}
```
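The reproduction mapping declares `text_embedding` as a `dense_vector` without `dims`, which is consistent with the `"this.dims" is null` message in the error. As a hedged sketch, the Python helper below lists such fields in a mapping so they can be reviewed. The function name `dense_vectors_without_dims` is hypothetical, and the idea that declaring `dims` explicitly sidesteps the failure on affected versions is an assumption not confirmed in this thread.

```python
from typing import List


def dense_vectors_without_dims(properties: dict, prefix: str = "") -> List[str]:
    """Return dotted paths of dense_vector fields that declare no `dims`.

    Hypothetical helper; it walks object fields through their "properties".
    """
    missing = []
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        if spec.get("type") == "dense_vector" and "dims" not in spec:
            missing.append(path)
        nested = spec.get("properties")
        if isinstance(nested, dict):
            missing.extend(dense_vectors_without_dims(nested, prefix=f"{path}."))
    return missing


# Abbreviated version of the mapping from the reproduction script.
mapping = {
    "properties": {
        "metadata": {"properties": {"source": {"type": "text"}}},
        "text": {"type": "text"},
        "text_embedding": {
            "type": "dense_vector",
            "index": True,
            "similarity": "cosine",
        },
    }
}
missing = dense_vectors_without_dims(mapping["properties"])
```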
BenTrent
(Ben Trent)
October 11, 2024, 12:14pm
6
> But "GET"? Really? You are not supposed to pass any HTTP body to a GET request.

ES has supported GET with a body for about a decade now, and that is unlikely to change. But you can use POST if you are more comfortable with that.
dadoonet
(David Pilato)
October 11, 2024, 12:47pm
8
That's not the request I mentioned, whichever verb you use, GET or POST.
JornWildt
(Jørn Wildt)
October 11, 2024, 1:07pm
9
Sorry, I misunderstood you. Anyway, it seems to be fixed in 8.15.1. See Elasticsearch version 8.15.1 | Elasticsearch Guide [8.15] | Elastic (at the bottom).
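For applications that need to decide whether a workaround is still required, a version comparison along these lines may help. This is a minimal sketch: the helper names are hypothetical, the threshold 8.15.1 comes from the release notes mentioned above, and the thread does not say whether other release lines received the fix, so versions below 8.15.1 are simply reported as not fixed.

```python
from typing import Tuple

# Threshold taken from the 8.15.1 release notes mentioned in the thread.
FIX_VERSION = (8, 15, 1)


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a version such as "8.15.1" or "8.15.2-SNAPSHOT" into integers."""
    core = version.split("-", 1)[0]  # drop suffixes like "-SNAPSHOT"
    return tuple(int(part) for part in core.split("."))


def has_empty_index_knn_fix(version: str) -> bool:
    """Return True if the version is at or above the assumed fix version."""
    return parse_version(version) >= FIX_VERSION
```

The version string would typically come from the cluster's root endpoint response; how it is fetched is left out here.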