
Hi, I am using the bulk API to ingest a pandas DataFrame into an Elasticsearch index. I first converted the DataFrame to a dict and then used bulk.
My .csv data file has a column named "start_date", which I converted to datetime using to_datetime. It has some empty rows, and when I used bulk I got this error:

'type': 'illegal_argument_exception', 'reason': 'cannot parse empty date'

After that I converted my empty rows (i.e. '') to pd.NaT using the replace function, but I am still getting the same error.
Please help me resolve this issue. Also, does NaT need any special handling when ingesting into Elasticsearch?
Thanks

It is unclear what the question is here: if you try to insert the empty string (or pd.NaT, for that matter) into a date field, then Elasticsearch will complain that the input is not a date - which is indeed true.

The question for you is: what do you want Elasticsearch to do for the rows where you have no data for "start_date"? One possible answer is to simply omit the field from the relevant documents. If that is what you want, you must delete the "start_date" key from your dict for those documents: while Elasticsearch will croak at attempting to index the dict {"other_data": "blabla", "start_date": ""} (for example), it will be quite happy with the dict {"other_data": "blabla"}, even if other documents do have the "start_date" field.
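As a minimal sketch of that idea (the column names and values here are just the examples from this thread, not your real data), dropping missing keys before indexing might look like:

```python
import pandas as pd

df = pd.DataFrame({
    "other_data": ["blabla", "foo"],
    "start_date": ["2021-03-01", ""],  # second row has an empty date
})
df["start_date"] = pd.to_datetime(df["start_date"], errors="coerce")  # "" -> NaT

# Drop every key whose value is missing (NaT/NaN), so the second
# document simply has no "start_date" field at all.
docs = [
    {k: v for k, v in record.items() if pd.notna(v)}
    for record in df.to_dict(orient="records")
]
```

The first document keeps its "start_date"; the second is just {"other_data": "foo"}, which Elasticsearch will accept happily.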

But, ultimately, the answer depends on what you want from a solution.

Hi @ftr , thanks for your response. I am assuming that if I put pd.NaT in the missing date rows, then I should not get any error when ingesting into Elasticsearch. Is that assumption correct?

No. You must make certain that, for every row in your input data, one or the other of the following statements is true:

  1. There is no key named "start_date" in the dictionary representing the row

or

  2. The value for the key "start_date" is a valid date, as defined by your mapping (which defaults, IIRC, to ISO 8601 or milliseconds since the epoch)

If any row in your input data does not fulfill one of these, your ingestion will fail.
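Putting both rules together, a hedged sketch (the index name "my-index" and the sample data are hypothetical) that drops missing dates and serializes the remaining timestamps as ISO 8601 strings, in the action format expected by elasticsearch.helpers.bulk:

```python
import pandas as pd

df = pd.DataFrame({
    "other_data": ["blabla", "foo"],
    "start_date": pd.to_datetime(["2021-03-01", ""], errors="coerce"),  # "" -> NaT
})

actions = []
for record in df.to_dict(orient="records"):
    doc = {}
    for key, value in record.items():
        if pd.isna(value):
            continue  # rule 1: no "start_date" key for rows with no date
        if isinstance(value, pd.Timestamp):
            value = value.isoformat()  # rule 2: a valid ISO 8601 date string
        doc[key] = value
    actions.append({"_index": "my-index", "_source": doc})
```

Every action now satisfies one of the two rules, so the bulk call should no longer hit "cannot parse empty date".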

Hi @ftr , thanks for the response. I understood your explanation, made the changes in my Python code accordingly, and it's working.
Thanks
Saurabh :slight_smile:

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.