I am trying to import data from a CSV that has several date columns with the format "6/21/2012", which my mapping reflects as
"type": "date",
"format": "M/d/yyyy"
Note: there are empty strings for some values in the date columns; I don't think these are being treated as null while importing.
While trying to import with the above date mappings I receive this error:
Index:
test_index
Documents ingested:
41
Failed documents:
4909
Some documents could not be imported:
4950 out of 4950 documents could not be imported. This could be due to lines not matching the Grok pattern.
elasticsearch.log is showing:
org.elasticsearch.index.mapper.MapperParsingException: failed to parse field [DATE] of type [date]
...
Caused by: java.lang.IllegalArgumentException: Invalid format: ""
The only way I can get it to ingest is by setting "index.mapping.ignore_malformed": true in the index settings. This seems kind of lazy, as I still want to be informed about other malformed mappings.
I haven't tried this, but you could test it pretty easily: given the mapping on the date field, if you index a document with an empty string ("") as the value, do you see the same error? And do you see the same error when you index a document that simply doesn't contain that field?
I think that's where the error comes from: Elasticsearch expects a string in the M/d/yyyy format, because that's what you set in the mapping, but it's getting an empty string instead, which doesn't match. ES supports sparse data but enforces mappings, so the solution here, aside from using ignore_malformed, may be to simply exclude the field from the document when you index it, instead of sending an empty string.
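To illustrate the "exclude the field" approach, here is a minimal sketch (using Python's standard csv module; the column names and sample data are hypothetical) that parses CSV rows into documents and drops date fields whose value is an empty string, so Elasticsearch never receives "" for a strict date field:

```python
import csv
import io

def csv_to_docs(csv_text, date_fields):
    """Parse CSV text into a list of dicts suitable for indexing,
    omitting any date field whose value is an empty string."""
    docs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Keep a key only if it is not an empty-string date value.
        doc = {k: v for k, v in row.items()
               if not (k in date_fields and v == "")}
        docs.append(doc)
    return docs

sample = "ID,DATE\n1,6/21/2012\n2,\n"
docs = csv_to_docs(sample, {"DATE"})
# Row 1 keeps its DATE; row 2 omits the DATE key entirely,
# so the document is sparse rather than malformed.
```

Documents built this way can then be sent through whatever indexing path you're already using (bulk API, a client library, etc.).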
I'm curious, too: what are you using to index the CSV data?
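Also, for what it's worth: if ignore_malformed at the index level feels too broad, Elasticsearch also accepts it as a per-field mapping parameter, so only this one date field tolerates bad values while everything else still fails loudly. A sketch of what that mapping might look like (field name taken from your log; adjust for your ES version, which may require a type name in the mapping):

```json
{
  "mappings": {
    "properties": {
      "DATE": {
        "type": "date",
        "format": "M/d/yyyy",
        "ignore_malformed": true
      }
    }
  }
}
```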