Index mapping problem

My current mapping looks like this:

        self.es_client.indices.create(
            index=self.index_name,
            mappings={
                "properties":{
                    text_field: {"type": "text"},
                    dense_vector_field: {"type": "dense_vector"},
                    num_characters_field: {"type": "integer"}
                }
            }
        )

That works well. However, when I try to add a property like this:

        self.es_client.indices.create(
            index=self.index_name,
            mappings={
                "properties":{
                    text_field: {"type": "text"},
                    title_field: {"type": "text"},
                    dense_vector_field: {"type": "dense_vector"},
                    num_characters_field: {"type": "integer"}
                }
            }
        )

After the docs are indexed, the dense_vector_field turns into a text type, so I cannot perform kNN search. Why is that?

The mappings shown in Kibana after I add the new property:

{
  "mappings": {
    "properties": {
      "num_characters": {
        "type": "integer"
      },
      "output_dense_vector": {
        "type": "text"
      },
      "text": {
        "type": "text"
      },
      "title": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}

Are you sure the JSON doc you're sending is using the field name that you declared as dense_vector?

Your mappings show an output_dense_vector field, which you haven't declared in your mappings. Your mapping declares a dense_vector_field field.

Also, Elasticsearch should dynamically map an array of floats with more than about 100 elements as dense_vector, so your field being mapped as text is a little suspicious. Check that the dense vector value in the doc you're inserting isn't quoted.

See the docs: Dense vector field type | Elasticsearch Guide [8.15] | Elastic
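A quick way to rule out the "quoted vector" case is to inspect one document right before it is sent. This is only a sketch: check_vector_field is a hypothetical helper, and the default field name output_dense_vector is taken from the Kibana mapping above, so adjust it to whatever your documents actually use.

```python
from numbers import Real


def check_vector_field(doc: dict, field: str = "output_dense_vector") -> list:
    """Return a list of problems found with doc[field]; an empty list means it looks fine."""
    if field not in doc:
        # A mismatched field name means the declared dense_vector mapping is never used.
        return [f"field {field!r} is missing; found keys: {sorted(doc)}"]

    value = doc[field]
    if isinstance(value, str):
        # A quoted vector is a string, which dynamic mapping would treat as text.
        return [f"{field!r} is a string, not a list of floats"]
    if not isinstance(value, (list, tuple)):
        return [f"{field!r} has type {type(value).__name__}, expected a list of floats"]
    if not all(isinstance(x, Real) and not isinstance(x, bool) for x in value):
        return [f"{field!r} contains non-numeric elements"]
    return []


# Example: a vector that was serialized to a string by mistake.
print(check_vector_field({"text": "hello", "output_dense_vector": "[0.1, 0.2]"}))
```

An empty list for a sample document suggests the problem is on the mapping side rather than in the document itself.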

Yep, so it's just a naming thing really. My whole function is below:

    def create_index(self, text_field="text", title_field="title", dense_vector_field="output_dense_vector", num_characters_field="num_characters"):
        # es_client = connect_elasticsearch()
        self.es_client.indices.create(
            index=self.index_name,
            mappings={
                "properties":{
                    text_field: {"type": "text"},
                    title_field: {"type": "text"},
                    dense_vector_field: {"type": "dense_vector"},
                    num_characters_field: {"type": "integer"}
                }
            }
        )

Also, my mapping in Kibana BEFORE I added the title field was:

{
  "mappings": {
    "properties": {
      "num_characters": {
        "type": "integer"
      },
      "output_dense_vector": {
        "type": "dense_vector",
        "dims": 1024,
        "index": true,
        "similarity": "cosine"
      },
      "text": {
        "type": "text"
      }
    }
  }
}

So the embedding part should not be the problem. Is there any other point I might have missed? Thanks for the doc though, but I've already checked it.

A few things to check, IMO:

  • What is the response of the self.es_client.indices.create() call?
  • Do you delete the index BEFORE trying to recreate it?
  • If not, you need to call the PUT mapping API instead.
  • Please note that the field names are not the same between the BEFORE mapping and what you are sending. Not sure if that's what you want or not.