I have a use case (the examples here are simplified to the essentials) where I want to run a kNN search on multiple vector fields of the same nested document inside a larger document, and be able to tell which of the nested documents was responsible for each hit.
Executing the search request fails with an error. The problem seems to occur when the inner_hits parts of both knn clauses are combined in one request.
Does anyone know a way to make this work, or whether this is a bug that has been (or is going to be) fixed in newer versions?
Below you can find the error, followed by the mappings and the query involved.
I'm currently using Elasticsearch v8.11.1.
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "[inner_hits] already contains an entry for key [paragraphs]"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "dfs_query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "aem_pages_nl_blue",
        "node": "AW2Ds3ckQnaYS9B-H34pUw",
        "reason": {
          "type": "illegal_argument_exception",
          "reason": "[inner_hits] already contains an entry for key [paragraphs]"
        }
      }
    ],
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "[inner_hits] already contains an entry for key [paragraphs]",
      "caused_by": {
        "type": "illegal_argument_exception",
        "reason": "[inner_hits] already contains an entry for key [paragraphs]"
      }
    }
  },
  "status": 400
}
Relevant part of the index template mappings (the real documents have more fields):
{
  "mappings": {
    "properties": {
      "websiteSection": {
        "type": "keyword"
      },
      "paragraphs": {
        "type": "nested",
        "properties": {
          "documentText": {
            "type": "text"
          },
          "documentTitle": {
            "type": "text"
          },
          "textEmbedding": {
            "type": "dense_vector",
            "dims": 1536,
            "index": true,
            "similarity": "dot_product"
          },
          "titleEmbedding": {
            "type": "dense_vector",
            "dims": 1536,
            "index": true,
            "similarity": "dot_product"
          }
        }
      }
    }
  }
}
The search request is sometimes a hybrid query, but not always. Whether it is or not makes no difference: the error is the same.
{
  "knn": [
    {
      "field": "paragraphs.titleEmbedding",
      "query_vector": [
        "1536 floating point numbers left out for simplicity"
      ],
      "k": 20,
      "num_candidates": 50,
      "filter": [
        {
          "term": {
            "websiteSection": "forum"
          }
        }
      ],
      "inner_hits": {
        "_source": [
          "paragraphs.documentTitle"
        ],
        "fields": [
          "paragraphs.documentTitle"
        ]
      }
    },
    {
      "field": "paragraphs.textEmbedding",
      "query_vector": [
        "1536 (maybe different from previous) floating point numbers left out for simplicity"
      ],
      "k": 20,
      "num_candidates": 50,
      "filter": [
        {
          "term": {
            "websiteSection": "forum"
          }
        }
      ],
      "inner_hits": {
        "_source": [
          "paragraphs.documentText"
        ],
        "fields": [
          "paragraphs.documentText"
        ]
      }
    }
  ]
}
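A possible workaround, not verified on 8.11.1: the error complains about a duplicate [paragraphs] key, and the inner_hits definition accepts a name option that, for nested fields, defaults to the nested path. Both knn clauses would therefore end up with the same default name. Giving each inner_hits its own name might avoid the collision and would also show which vector field produced a hit. In the sketch below, the names title_hits and text_hits are made up for illustration, and it is an assumption that the knn inner_hits honours name in this version; everything else is copied from the query above.

```json
{
  "knn": [
    {
      "field": "paragraphs.titleEmbedding",
      "query_vector": [
        "1536 floating point numbers left out for simplicity"
      ],
      "k": 20,
      "num_candidates": 50,
      "filter": [
        { "term": { "websiteSection": "forum" } }
      ],
      "inner_hits": {
        "name": "title_hits",
        "_source": [ "paragraphs.documentTitle" ],
        "fields": [ "paragraphs.documentTitle" ]
      }
    },
    {
      "field": "paragraphs.textEmbedding",
      "query_vector": [
        "1536 (maybe different from previous) floating point numbers left out for simplicity"
      ],
      "k": 20,
      "num_candidates": 50,
      "filter": [
        { "term": { "websiteSection": "forum" } }
      ],
      "inner_hits": {
        "name": "text_hits",
        "_source": [ "paragraphs.documentText" ],
        "fields": [ "paragraphs.documentText" ]
      }
    }
  ]
}
```

If this works, each search hit's inner_hits object should contain separate title_hits and text_hits entries instead of a single paragraphs entry, so the matching nested document can be traced back to the title or the text embedding.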