Date parse error on only one record

I'm trying to import data from a CSV into an ES index using Logstash. The CSV has separate date and time columns, which I combine and then parse with the date filter plugin so the result can be used as the @timestamp. There are 150500 records, and all of them parse correctly except for one. Reviewing that record, there is nothing obviously abnormal about it that would cause such an issue, so I'm at a loss. I've tried deleting the index and rerunning Logstash multiple times, and each time the same record fails. The record is tagged with _dateparsefailure, and its @timestamp is the only one containing the upload time instead of the parsed time.

I'm new to Logstash, so there's probably a better way to do this, but I have a "Date" field formatted like "MM/dd/yyyy 12:00:00 AM" (yes, every record is 12 AM) and a "Time" field formatted like "HH:mm". I pass them through the following filters:

truncate {
    fields => "Date"
    length_bytes => 10
}
mutate {
    add_field => { "DateTime" => "%{Date} %{Time}" }
    remove_field => [ "Date", "Time" ]
}
date {
    match => [ "DateTime", "MM/dd/yyyy HH:mm" ]
    timezone => "America/Los_Angeles"
    remove_field => [ "DateTime" ]
}

The record in question contains Date and Time like so: 03/13/2016 12:00:00 AM,02:30.

I should also point out that I don't seem to get an error message even if I also output to stdout; the event just shows _dateparsefailure in its tags.


I guess the date filter fails because of the suffix " AM,02:30". Although I have not tried it, I think the correct format definition would be (see here for details): MM/dd/yyyy HH:mm a,ZZ.

You can provide multiple formats to the date filter, so Logstash tries each in turn and uses the first one that matches:

date {
    match => [ "DateTime", "MM/dd/yyyy HH:mm", "MM/dd/yyyy HH:mm a,ZZ" ]
    timezone => "America/Los_Angeles"
    remove_field => [ "DateTime" ]
}

Best regards

Because I truncate the Date field to 10 bytes, the 12:00:00 AM is removed before DateTime is even created. The DateTime field comes out like this: "DateTime" => "03/13/2016 02:30"
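To make the effect of the two filters concrete, here is a small Python sketch that mimics them on the failing record. It is only an illustration of the truncate-then-concatenate step, not how Logstash implements these filters, and the helper name build_datetime is hypothetical.

```python
def build_datetime(date_field: str, time_field: str, length_bytes: int = 10) -> str:
    """Mimic the truncate and mutate steps from the pipeline above."""
    # truncate: keep only the first `length_bytes` bytes of the Date field.
    truncated = date_field.encode("utf-8")[:length_bytes].decode("utf-8", errors="ignore")
    # mutate add_field: join the pieces the way "%{Date} %{Time}" would.
    return f"{truncated} {time_field}"


combined = build_datetime("03/13/2016 12:00:00 AM", "02:30")
print(combined)  # 03/13/2016 02:30
```

So the string that reaches the date filter already matches the "MM/dd/yyyy HH:mm" layout; the format itself is not the problem.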

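That time does not exist in US time zones that observe daylight saving time, including the America/Los_Angeles zone set in your filter. On that date the clock jumped directly from 01:59:59 to 03:00:00 because daylight saving time began.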

I have plenty of other records that occur at similar times, like 03/12/2016 12:00:00 AM,02:45.

Daylight saving time did not start on the 12th; it started on the 13th. So only on the 13th did times between 2 and 3 AM not exist.
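One way to check this outside Logstash is to round-trip the wall-clock time through UTC: a local time that falls inside a daylight saving gap does not come back unchanged. The sketch below uses Python's standard zoneinfo module (Python 3.9+) and assumes a time zone database is available on the machine; the helper name exists_in_zone is hypothetical.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def exists_in_zone(naive_local: datetime, zone_name: str) -> bool:
    """Return True if the naive wall-clock time actually occurs in the zone."""
    zone = ZoneInfo(zone_name)
    aware = naive_local.replace(tzinfo=zone)
    # A time inside a DST gap maps to a different wall-clock time after the round trip.
    round_trip = aware.astimezone(timezone.utc).astimezone(zone)
    return round_trip.replace(tzinfo=None) == naive_local


fmt = "%m/%d/%Y %H:%M"
zone_name = "America/Los_Angeles"  # same zone as the date filter in the question

failing_exists = exists_in_zone(datetime.strptime("03/13/2016 02:30", fmt), zone_name)
day_before_exists = exists_in_zone(datetime.strptime("03/12/2016 02:45", fmt), zone_name)
print(failing_exists, day_before_exists)  # False True
```

Running the same check over the Date and Time columns before import would flag any other records that land in a gap.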

Oh gotcha, that makes sense. I'm assuming it was just a data entry error then. There are plenty of records during hours 01 and 03 on the 13th, and this was the only record for 02. Is there a better way to combine the Date and Time fields directly in the date filter so I don't have to go through the truncate and mutate filters?

Not that I can think of.
