Date_time_parse_exception Warning message in Logstash while fetching data from CSV file

Hi,

I am trying to fetch data from a few CSV files with the following Logstash configuration:

input {
  file {
    path => "/root/API*"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    type => "API"
  }
}

filter {
  csv {
    separator => ","
    columns => ["TIMESTAMP_DERIVED","USER_ID_DERIVED","CLIENT_IP","URI_ID_DERIVED"]
  }
}

output {
  elasticsearch {
    hosts => ["xxx:443"]   # 443 because of managed ES
    index => "apisalesforceapi-%{+YYYY.MM}"
    user => "xxx"
    password => "xxx"
    ilm_enabled => false   # managed ES
  }
  stdout { codec => rubydebug }
}

Here, "TIMESTAMP_DERIVED" is the field that is causing the issue.

Logstash is able to read the CSV file and the index is created in the managed ES as well. However, I end up getting the WARN below every time:

[2020-11-10T14:04:12,995][WARN ][logstash.outputs.elasticsearch][python][f1948bfb7238388c36f40fa69ae4943193b1e7dcc21fa07b7f74a47b7a0c1474] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"apisalesforceapi-2020.11", :routing=>nil, :_type=>"_doc"}, #<LogStash::Event:0x6469834f>], :response=>{"index"=>{"_index"=>"apisalesforceapi-2020.11", "_type"=>"_doc", "_id"=>"44N4snUBvPvlSgV_p801", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [TIMESTAMP_DERIVED] of type [date] in document with id '44N4snUBvPvlSgV_p801'. Preview of field's value: 'TIMESTAMP_DERIVED'", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"failed to parse date field [TIMESTAMP_DERIVED] with format [strict_date_optional_time||epoch_millis]", "caused_by"=>{"type"=>"date_time_parse_exception", "reason"=>"Failed to parse with all enclosed parsers"}}}}}}

Please help me resolve this !! Thank you !!

Can you share a few example lines from your input file?
The error here means you need to apply a date filter to your date field so it can be parsed correctly.
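
For reference, a minimal date filter sketch. It assumes the field holds ISO8601 values, which is only an assumption since no sample data had been shared at this point. It parses the field into @timestamp and leaves TIMESTAMP_DERIVED itself unchanged.

```
filter {
  date {
    # Assumption: values look like 2020-11-08T06:16:48.336Z
    match => ["TIMESTAMP_DERIVED", "ISO8601"]
    # @timestamp is the default target; it is what %{+YYYY.MM} in the index name uses
    target => "@timestamp"
  }
}
```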

You have a header row. Elasticsearch expects TIMESTAMP_DERIVED to be a date and it cannot parse the string value "TIMESTAMP_DERIVED" as a date.

You might find the skip_header option on the csv filter helpful.

Hi,

This is my CSV file:

"EVENT_TYPE","TIMESTAMP","REQUEST_ID","ORGANIZATION_ID","USER_ID","RUN_TIME","CPU_TIME","URI","SESSION_KEY","LOGIN_KEY","REQUEST_STATUS","DB_TOTAL_TIME","API_TYPE","API_VERSION","CLIENT_NAME","METHOD_NAME","ENTITY_NAME","ROWS_PROCESSED","REQUEST_SIZE","RESPONSE_SIZE","DB_BLOCKS","DB_CPU_TIME","TIMESTAMP_DERIVED","USER_ID_DERIVED","CLIENT_IP","URI_ID_DERIVED"
"API","20201108061648.336","4Zxx--","00xxx","00xxx","459xx","31","Api","Wbsxxx","hfRxxx","","32xxx","M","50.0","sfdx xxx","meta_retrieve","","","843","330","590","30","2020-11-08T06:16:48.336Z","0053Jxxx","103.xx.xx.xxx",""

In the config above I removed most of the column names, as the full list is long.

The issue seems to be only with TIMESTAMP_DERIVED.

The other fields are strings, so Elasticsearch will not care if it gets a value like "REQUEST_ID". Date fields do care.
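
As a rough alternative sketch, the header row could also be dropped explicitly with a conditional placed after the csv filter. This assumes the header row's value for the column is exactly the column name, as the "Preview of field's value" in the error suggests.

```
filter {
  # Assumption: this runs after the csv filter has populated the field.
  # The header row carries the literal column name as its value, so drop that event.
  if [TIMESTAMP_DERIVED] == "TIMESTAMP_DERIVED" {
    drop { }
  }
}
```

Every other row keeps its real timestamp value and passes through untouched.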

Yes, adding skip_header to the csv filter resolved the issue:

filter {
  csv {
    separator => ","
    columns => ["EVENT_TYPE","TIMESTAMP","REQUEST_ID","ORGANIZATION_ID","USER_ID","RUN_TIME","CPU_TIME","URI","SESSION_KEY","LOGIN_KEY","REQUEST_STATUS","DB_TOTAL_TIME","API_TYPE","API_VERSION","CLIENT_NAME","METHOD_NAME","ENTITY_NAME","ROWS_PROCESSED","REQUEST_SIZE","RESPONSE_SIZE","DB_BLOCKS","DB_CPU_TIME","TIMESTAMP_DERIVED","USER_ID_DERIVED","CLIENT_IP","URI_ID_DERIVED"]
    skip_header => true
  }
}

Thank you so much !!
