Unknown key for a START_OBJECT in [knn]

Has anybody actually used the kNN sample in the documentation here: k-nearest neighbor (kNN) search | Elasticsearch Guide [master] | Elastic?

Here is their sample:

POST image-index/_search
{
  "query": {
    "match": {
      "title": {
        "query": "mountain lake",
        "boost": 0.9
      }
    }
  },
  "knn": {
    "field": "image-vector",
    "query_vector": [54, 10, -2],
    "k": 5,
    "num_candidates": 50,
    "boost": 0.1
  },
  "size": 10
}

Here is my code based on that:

      hybrid_lex_sem = self.es.search(index=indexn, body= {
                                          "query": {
                                            "match": {
                                              "question": {
                                                "query": cleanq,
                                                "boost": 0.9
                                              },
                                            },
                                          },
                                          "knn": {
                                            "field": "question_vector",
                                            "query_vector": question_embedding,
                                            "k": 5,
                                            "num_candidates": 50,
                                            "boost": 0.1
                                          },
                                          "size": 10
                                        }
                                      )

Here is the parsing exception I get from the above (Python)

'parsing_exception', 'reason': 'Unknown key for a START_OBJECT in [knn].'

The fields in my query all work fine with a regular dense vector search (non-kNN).

Can somebody direct me where the syntax error is in my version of the documentation's own sample? Thank you very much.


Hi @Scott_M

What is your ES version?

It is 8.5, which I believe is the latest.

I tested it successfully on version 8.4. I assumed you were using an older version.

Thank you very much. I tried version 8.4 just now and got the same error. Much appreciated though, that was a good idea!

Must be something wrong in my syntax, because it cannot even parse my query:

'parsing_exception', 'reason': 'Unknown key for a START_OBJECT in [knn].'

What is your Python client version? The search body looks OK at first glance. Maybe the Python client is doing something unexpected. I know that in later versions of the client you specify each part of the search body as its own argument instead of passing the whole thing.

Maybe something like:

hybrid_lex_sem = self.es.search(index=indexn, query= {
                                            "match": {
                                              "question": {
                                                "query": cleanq,
                                                "boost": 0.9
                                              },
                                            },
                                          },
                                          knn= {
                                            "field": "question_vector",
                                            "query_vector": question_embedding,
                                            "k": 5,
                                            "num_candidates": 50,
                                            "boost": 0.1
                                          },
                                          size= 10
                                      )

Thank you for the great idea to try, I really appreciate it. For some reason, even with your excellent suggestion, I still get the same:

'parsing_exception', 'reason': 'Unknown key for a START_OBJECT in [knn].', 'line': 1, 'col': 8}

I wonder what "unknown key" really means... those dict key names like "field" and "query_vector" are taken from their own example, in the documentation.

RabBit_BR, you seem to be the only person in the known universe who's ever actually run this. Even the ES support people don't have an answer.

Could I trouble you to show me exactly how you called with that query? Did you use request/curl or straight python or how exactly did you get that to parse??

Thank you so much in advance!

I hit the same problem in .NET:

Client: Elastic.Clients.Elasticsearch 8.0.1
Elasticsearch DB version: 8

How I made the request in C#:

        var response = await _client.SearchAsync<ImageSearchDocument>(
            s => s
                .Knn(k => k
                    .QueryVector(embedding_vector) // ICollection<double>
                    .Field(_EmbeddingVectorStr) // string
                    .k(searchTotal) // long
                    .NumCandidates(numCandidate) // long
                 )
                .SourceExcludes(_EmbeddingVectorStr) // string
                .Index(FullImageSearchIndexName) //string
        );

Request Body

{
   "knn":{
      "field":"embedding_vector",
      "k":10,
      "num_candidates":20,
      "query_vector":[
      ]
   }
}

Error Response

{"error":{"root_cause":[{"type":"parsing_exception","reason":"Unknown key for a START_OBJECT in [knn].","line":1,"col":8}],"type":"parsing_exception","reason":"Unknown key for a START_OBJECT in [knn].","line":1,"col":8},"status":400}

This issue has confused me for 2 days. Any hints will be appreciated.

First, I believe you should upgrade your deployment (on the dashboard) to 8.5.2 of the core Elasticsearch service, and separately make sure your client is the matching latest version (for me, with Python, that is 8.5).
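
A minimal sketch of how that version comparison could be done from Python. `major_minor` and `describe_versions` are hypothetical helper names made up for illustration, and the cluster URL in the usage comment is a placeholder; the only client calls assumed are `es.info()` and `elasticsearch.__version__`.

```python
def major_minor(version):
    """Reduce '8.5.2' or (8, 5, 0) to the pair (8, 5)."""
    parts = version.split(".") if isinstance(version, str) else list(version)
    return int(parts[0]), int(parts[1])


def describe_versions(server_version, client_version):
    """Report both versions and whether their major.minor parts agree."""
    server = major_minor(server_version)
    client = major_minor(client_version)
    return {"server": server, "client": client, "same_minor": server == client}


# Usage against a live cluster (not executed here; the URL is a placeholder):
#   import elasticsearch
#   es = elasticsearch.Elasticsearch("http://localhost:9200")
#   print(describe_versions(es.info()["version"]["number"],
#                           elasticsearch.__version__))
```

A mismatch here does not prove the client is the cause, but it is a cheap thing to rule out before digging into the query syntax.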

Thank you for your reply. I double-checked: my ES is 8.3.2 and my .NET client (Elastic.Clients.Elasticsearch) is the latest stable version, 8.0.1.
A Postman request with the provided request body successfully receives a response. I'm still trying to figure out why it doesn't work.

Oh, OK.
Just curious, why is your query_vector empty?

"query_vector":[
]

I removed the float numbers for readability. (It originally contained a list of 256 floats.)

@Scott_M I think the issue in your case is that you are using search. I suggest using knn_search instead.

The Python code below works for me:

target_float_list = [...]  # a list of 256 floats
res = client.knn_search(
    index=client._active_index,
    knn={"field": "embedding_vector", "query_vector": target_float_list, "k": 10, "num_candidates": 100},
)

My Request Body

{
   "knn":{
      "field":"embedding_vector",
      "query_vector":[],
      "k":10,
      "num_candidates":100
   },
   "_source":{
      "excludes":[
         "embedding_vector"
      ]
   }
}
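
For anyone landing here later, a rough sketch of how the two calls could be wrapped. It assumes, going only by the reports in this thread, that `search(knn=...)` needs an 8.4 or later server and that older 8.x servers only accept `knn_search`. `build_knn_clause` and `run_knn` are hypothetical helper names, not part of the client.

```python
def build_knn_clause(field, query_vector, k=10, num_candidates=100):
    """Return the dict passed as `knn`; reject an empty vector early."""
    vector = [float(x) for x in query_vector]
    if not vector:
        raise ValueError("query_vector must not be empty")
    return {
        "field": field,
        "query_vector": vector,
        "k": k,
        "num_candidates": num_candidates,
    }


def run_knn(client, index, knn, server_version):
    """Pick the API by server version (assumption: 8.4 is the cutoff)."""
    major, minor = (int(p) for p in server_version.split(".")[:2])
    if (major, minor) >= (8, 4):
        # Newer servers: `knn` is a top-level option of the search API.
        return client.search(index=index, knn=knn)
    # Older 8.x servers: use the dedicated kNN search API instead.
    return client.knn_search(index=index, knn=knn)
```

As far as I know knn_search has no `query` parameter, so the hybrid match + knn request from the first post would only work on the `search` path.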
