Date issue when indexing CSV into ES with Python

Hi,

I use Python's csv.DictReader to bulk insert CSV data into ES, and I found that the date column is read as a string.
I tried two methods to deal with the date column's data type:

  1. Pre-define an index template that sets the column to type date.
  2. Create a mapping, create the index using that mapping, and then bulk insert the data into that index.

Neither method solved the date problem. The index was created but no data was inserted. Is this an Elasticsearch issue, a Python issue, or a data issue?

My date values look like this: 2013-11-01 17:30:35.960
Some dates are empty; I converted those to NaT.

How should I handle a date column when using Python to bulk insert CSV data?

import csv
import glob

from elasticsearch import helpers

mapping = {
  "mappings": {
      "properties": {
        "Activation": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "ActivationDate": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss.SSS"
        },
        "UserId": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
  }
}

# Create the index with the above mapping
response = es.indices.create(index=index_name, ignore=400, body=mapping)

# Check the status
print('response:', response)

# Start inserting
filepath = npath + '*' + extensions
# Read all CSV files from the folder
for fname in glob.glob(filepath):
    print(fname)  # to check the CSV file name
    with open(fname, encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter='|')
        helpers.bulk(es, reader, index=index_name, doc_type='_doc')
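
The loop above hands the raw csv.DictReader rows straight to helpers.bulk, so a placeholder such as NaT reaches Elasticsearch as a literal string in a field mapped as a date. One possible approach, sketched here under assumptions, is to clean each row first: drop the date field when it is empty or NaT, and check the remaining values against the expected format before sending. The names clean_rows, DATE_FIELD, PY_DATE_FORMAT and EMPTY_MARKERS are hypothetical, the sample data is made up, and Python's %f accepts 1 to 6 fractional digits, so this check is slightly looser than the SSS pattern in the mapping.

```python
import csv
import io
from datetime import datetime

DATE_FIELD = "ActivationDate"            # field name taken from the mapping above
PY_DATE_FORMAT = "%Y-%m-%d %H:%M:%S.%f"  # assumed Python counterpart of yyyy-MM-dd HH:mm:ss.SSS
EMPTY_MARKERS = {"", "NaT"}              # assumption: how empty dates appear in the CSV


def clean_rows(rows):
    """Yield one dict per CSV row, with empty dates removed and others validated."""
    for row in rows:
        doc = dict(row)
        value = (doc.get(DATE_FIELD) or "").strip()
        if value in EMPTY_MARKERS:
            # Omit the field entirely rather than sending a non-date string.
            doc.pop(DATE_FIELD, None)
        else:
            # Raises ValueError if the value does not match the expected format.
            datetime.strptime(value, PY_DATE_FORMAT)
            doc[DATE_FIELD] = value
        yield doc


# Made-up sample using the same '|' delimiter as the thread's CSV files.
sample = io.StringIO(
    "Activation|ActivationDate|UserId\n"
    "yes|2013-11-01 17:30:35.960|u1\n"
    "no|NaT|u2\n"
)
docs = list(clean_rows(csv.DictReader(sample, delimiter="|")))
print(docs)
```

In the loop from the post, the generator would wrap the reader, for example helpers.bulk(es, clean_rows(reader), index=index_name).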

I kind of 'solved' the problem by adding the following to the mapping:

"ignore_malformed": "true"

However, when I create the index pattern, I cannot set a primary time field. If I do set one, no data is displayed in Discover/Search.

Does anyone know what happens when the primary time field is set?
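
For context, ignore_malformed can be set per field in the mapping. With it enabled, a value that fails to parse is left out of that field while the rest of the document is still indexed. Once a primary time field is chosen, Discover filters on it by time range, so documents that have no value in that field, or whose dates fall outside the selected range (2013 data against a short default range, for example), will not be shown. A minimal sketch of the field-level placement follows; date_field_mapping is a hypothetical name, and whether this matches the mapping actually used in the post is an assumption.

```python
# Field-level placement of ignore_malformed on the date field (sketch).
date_field_mapping = {
    "ActivationDate": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss.SSS",
        # Unparseable values are dropped from this field instead of
        # rejecting the whole document.
        "ignore_malformed": True,
    }
}
```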

If you are ignoring malformed values and also not seeing anything in the primary timestamp field, then something is wrong with the date values being sent.

Why not print the documents you are sending to Elasticsearch into the console/stdout/log so you can check what they look like and go from there?
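
One way to follow this suggestion, as a rough sketch, is to wrap the rows in a small generator that prints the first few documents as JSON and then passes everything through unchanged. The name preview, the limit parameter, and the sample rows are all hypothetical.

```python
import json


def preview(docs, limit=3):
    """Print the first `limit` documents as JSON, then yield every document unchanged."""
    for i, doc in enumerate(docs):
        if i < limit:
            print(json.dumps(doc, ensure_ascii=False))
        yield doc


# Made-up rows shaped like what csv.DictReader would produce.
rows = [
    {"Activation": "yes", "ActivationDate": "2013-11-01 17:30:35.960", "UserId": "u1"},
    {"Activation": "no", "ActivationDate": "NaT", "UserId": "u2"},
]
passed_through = list(preview(rows, limit=1))
```

Because helpers.bulk accepts any iterable of documents, the reader in the original loop could be wrapped the same way, for example helpers.bulk(es, preview(reader), index=index_name).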
