dense_vector field is mapped as type "float"

Hi,
I'm trying to load vector values into Elasticsearch,
and the load finishes without any error.
However, the resulting mapping shows the field type as "float", not "dense_vector".
I'm using Elasticsearch 7.5.0.

{
  "test" : {
    "mappings" : {
      "properties" : {
        "text" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "text_feature" : {
          "type" : "float"
        }
      }
    }
  }
}

I'm using the following Python code for loading.

import csv
import os
import sys
from elasticsearch import Elasticsearch, helpers
import yaml
from my_vectorizer import MySentenceVectorizer
VEC = MySentenceVectorizer()

def create_index(index='test'):
    es = Elasticsearch()

    with open('./mapping.yaml') as f:
        setting = yaml.load(f, Loader=yaml.SafeLoader)
    properties = setting['mappings']['properties']

    def generate_data():
        with open('./text.csv', 'r') as f:
            reader = csv.reader(f)
            attrs = next(reader)
            for lid, row in enumerate(reader):
                data = {
                    '_op_type': 'index',
                    '_index': index,
                }
                for j, value in enumerate(row):
                    if attrs[j] in properties:
                        data[attrs[j]] = value
                    if attrs[j] == 'text':
                        data['text_feature'] = VEC.vectorize(value).tolist()

                yield data

    print(helpers.bulk(es, generate_data()))

create_index()

The value returned by "VEC.vectorize(value).tolist()" is definitely a list of floats.
My YAML settings are as follows.

settings:
  index:
    analysis:
      analyzer:
        my_analyzer:
          type: custom
          tokenizer: kuromoji_tokenizer
          filter:
            - kuromoji_baseform
mappings:
  properties:
    text:
      type: text
      index: true
      fielddata: true
      analyzer: my_analyzer
    text_feature:
      type: dense_vector
      dims: 768

The vector values are generated by a BERT vectorizer.
I'm completely stuck.
I hope someone can help me.
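
One thing that may be worth checking: the loading script reads mapping.yaml but, as posted, never sends it to Elasticsearch. When documents are bulk-indexed into an index that does not exist yet, Elasticsearch creates the index automatically, and dynamic mapping types an array of floating-point numbers as "float". The same applies to any field that is not declared in the mapping. Below is a minimal sketch of creating the index explicitly before loading. It assumes the 7.x elasticsearch-py client (indices.exists and indices.create with a body argument); ensure_index and undeclared_fields are hypothetical helper names, not part of any library.

```python
# Sketch only: assumes a 7.x elasticsearch-py client object is passed in as `es`.
# ensure_index and undeclared_fields are illustrative helpers, not library functions.

def ensure_index(es, index, setting):
    """Create `index` with the explicit settings/mappings if it does not exist.

    Returns True if the index was created, False if it was already there.
    """
    if es.indices.exists(index=index):
        # An existing index keeps its current mapping. The type of a field
        # that is already mapped cannot be changed in place, so an index
        # that was auto-created earlier has to be deleted and rebuilt.
        return False
    es.indices.create(index=index, body=setting)
    return True


def undeclared_fields(doc, properties):
    """Return the document fields that are not declared in the mapping.

    Keys starting with '_' (bulk metadata such as _index) are ignored.
    """
    return sorted(
        key for key in doc
        if not key.startswith('_') and key not in properties
    )
```

In the script above, ensure_index(es, index, setting) would be called once before helpers.bulk, and undeclared_fields(data, properties) could be checked inside generate_data to catch a field name that differs from the one in mapping.yaml. If the "test" index already exists with the "float" mapping, it would need to be deleted and recreated first.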
