We have an index with at least two nesting levels (I know this is not optimal in Elasticsearch, but we need it), and the dense vectors are in the second level. From the results I can get the top-level fields (they are unique per hit) and the lowest-level ones from the inner hits, but I can't find a way to retrieve the fields of the middle level.
Let me set up an example.
This is the mapping:
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text"
      },
      "chapters": {
        "type": "nested",
        "properties": {
          "number": {
            "type": "integer"
          },
          "c_vector": {
            "type": "dense_vector",
            "dims": 3,
            "similarity": "cosine"
          },
          "paragraphs": {
            "type": "nested",
            "properties": {
              "content": {
                "type": "text"
              },
              "p_vector": {
                "type": "dense_vector",
                "dims": 3,
                "similarity": "cosine"
              }
            }
          }
        }
      }
    }
  }
}
So we have a book with chapters and paragraphs. Chapters also have dense vectors; they are not strictly needed for this question, but they help to illustrate it.
When searching for chapters, I can run this query:
{
  "knn": {
    "query_vector": [1, 2, 3],
    "field": "chapters.c_vector",
    "k": 3,
    "num_candidates": 3,
    "inner_hits": {
      "_source": false,
      "fields": ["chapters.number"]
    }
  },
  "fields": ["title", "chapters.number"],
  "_source": false
}
In the results:
- from the hit, I get the title and the list of chapters;
- from the inner hit, I get the top matching chapter.
OK, that's fine.
But now, if I want to search paragraphs, the query would be:
{
  "knn": {
    "query_vector": [0, 2, 2],
    "field": "chapters.paragraphs.p_vector",
    "k": 3,
    "num_candidates": 3,
    "inner_hits": {
      "_source": false,
      "fields": ["chapters.paragraphs.content", "chapters.number"]
    }
  },
  "fields": ["title", "chapters.number"],
  "_source": false
}
So I can get the title, and in the inner hits I get the paragraph with the highest score, but how can I know which chapter this paragraph belongs to?
In the outer fields, I get all the chapters of the book.
Adding "chapters.number" to the knn inner_hits has no effect.
In regular searches, a nested query can be embedded in another nested query, and each may return its own inner_hits, but I don't think this is possible in knn searches.
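For comparison, this is roughly what I mean by embedding nested queries in a regular (non-knn) search. It is only a sketch against the mapping above: the match text "some text" is a placeholder I made up, and I have not checked the exact response shape on every version. As far as I understand, the paragraph inner hits are then returned inside each matching chapter inner hit, which is exactly the chapter-to-paragraph link I am missing in the knn case.

```json
{
  "query": {
    "nested": {
      "path": "chapters",
      "query": {
        "nested": {
          "path": "chapters.paragraphs",
          "query": {
            "match": { "chapters.paragraphs.content": "some text" }
          },
          "inner_hits": {
            "_source": false,
            "fields": ["chapters.paragraphs.content"]
          }
        }
      },
      "inner_hits": {
        "_source": false,
        "fields": ["chapters.number"]
      }
    }
  },
  "fields": ["title"],
  "_source": false
}
```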
To summarize: when there are multiple levels of nesting and the knn search is done on the lower level, how can the middle-level fields of a hit be obtained?