Using inner_hits in knn search with rrf

I'm relatively new to Elasticsearch and have a knn query that uses inner_hits to retrieve nested passages within a document. I want to combine this knn query with a match query as part of a hybrid search, but I'm getting errors.

According to the rrf documentation (here) this should be possible, but the documentation for the knn retriever used with rrf (here) doesn't mention inner_hits as one of the parameters.

Am I misunderstanding the capabilities of rrf?

Whenever I try running a query I get an error saying:

{
  "error": {
    "root_cause": [
      {
        "type": "x_content_parse_exception",
        "reason": "[27:13] [knn] unknown field [inner_hits]"
      }
    ],
    "type": "x_content_parse_exception",
    "reason": "[27:27] [rrf] failed to parse field [retrievers]",
    "caused_by": {
      "type": "x_content_parse_exception",
      "reason": "[27:13] [knn] unknown field [inner_hits]"
    }
  },
  "status": 400
}

Here is the query I am running:

{
  "retriever": {
    "rrf": {
      "retrievers": [
        {
          "standard": {
            "query": {
              "match": {
                "extracted_text": {
                  "query": "${query}"
                }
              }
            }
          }
        },
        {
          "knn": {
            "k": 4,
            "field": "passages.passage_embedding.predicted_value",
            "num_candidates": 100,
            "query_vector_builder": {
              "text_embedding": {
                "model_id": "intfloat__multilingual-e5-large",
                "model_text": "${query}"
              }
            },
            "inner_hits": {
              "size": 4,
              "_source": {
                "includes": [
                  "passages.text",
                  "passages.title",
                  "passages.url"
                ]
              }
            }
          }
        }
      ],
      "rank_constant": 1,
      "rank_window_size": 10
    }
  }
}

Hi @gregp

I have my doubts whether rrf works with inner_hits; as far as I know, inner_hits only works with the nested query. Maybe someone from the Elastic team can confirm this.

Sorry, indeed the knn retriever doesn't support inner_hits. I've created an issue to add this functionality.

Meanwhile, as a workaround, instead of a knn retriever you can use a standard retriever with a nested knn query, like this:

{
  "retriever": {
    "standard": {
      "query": {
        "nested": {
          "path": "paragraph",
          "inner_hits": {
            "size": 4,
            "_source": {
              "includes": [
                "paragraph.text"
              ]
            }
          },
          "query": {
            "knn": {
              "query_vector": [0.45, 45],
              "field": "paragraph.vector",
              "k": 2
            }
          }
        }
      }
    }
  }
}
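
Applied to the hybrid query from the question, the workaround might look roughly like the sketch below. It is untested and rests on a few assumptions: that "passages" is the nested path (inferred from the field name passages.passage_embedding.predicted_value), that the knn query in your version accepts query_vector_builder, k and num_candidates, and that inner_hits from a nested query are returned when the standard retriever sits inside rrf. The ${query} placeholder is kept from the original request.

```json
{
  "retriever": {
    "rrf": {
      "retrievers": [
        {
          "standard": {
            "query": {
              "match": {
                "extracted_text": {
                  "query": "${query}"
                }
              }
            }
          }
        },
        {
          "standard": {
            "query": {
              "nested": {
                "path": "passages",
                "inner_hits": {
                  "size": 4,
                  "_source": {
                    "includes": [
                      "passages.text",
                      "passages.title",
                      "passages.url"
                    ]
                  }
                },
                "query": {
                  "knn": {
                    "field": "passages.passage_embedding.predicted_value",
                    "k": 4,
                    "num_candidates": 100,
                    "query_vector_builder": {
                      "text_embedding": {
                        "model_id": "intfloat__multilingual-e5-large",
                        "model_text": "${query}"
                      }
                    }
                  }
                }
              }
            }
          }
        }
      ],
      "rank_constant": 1,
      "rank_window_size": 10
    }
  }
}
```

The only structural change from the original request is that the second retriever is a standard retriever wrapping a nested query, so inner_hits sits on the nested query rather than on the knn retriever, which is where the parse error came from.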

Thank you! I will look into using this!