"_dateparsefailure"

Hi there,
I am trying to parse a .CSV file with a pretty horrible date format.
I am using the Date filter:

    date {
        match => [ "HelperDate", "dMMyyyy", "ddMMyyyy" ]
    }

Some example data:

6112019 -> "_dateparsefailure"
1122019 -> "_dateparsefailure"
8032019 -> "_dateparsefailure"
29112019 -> works
26112019 -> works
28022019 -> works

I think the problem only occurs on single-digit days. I thought "d" would do the job.

Maybe somebody knows how to fix the problem.
I was thinking about a possible workaround: Adding a 0 before the whole field, when the field length is only 7 digits. Unfortunately I am still a beginner, so I don't know if this is even possible.

Thank you for helping me out.
If more information is needed, please let me know.

Kind Regards
Marvin

I simulated your date filter, and the dates with a single-digit day also fail for me.

You can implement your workaround (adding a 0 when the date has 7 digits) using a conditional and some mutate filters.

The following config will do:

    if [HelperDate] =~ /^[0-9]{7}$/ {
        mutate {
            rename => { "HelperDate" => "oldHelperDate" }
        }
        mutate {
            add_field => { "HelperDate" => "0%{oldHelperDate}" }
        }
    }

The if condition tests your HelperDate field against a regex that matches a sequence of exactly 7 digits. If it matches, the first mutate renames the field to oldHelperDate, and the second one recreates HelperDate by putting a 0 in front of the oldHelperDate value.

If you want, you can remove the oldHelperDate field with another mutate:

    mutate {
        remove_field => ["oldHelperDate"]
    }
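
A shorter variant, offered as a sketch rather than something tested in this thread: the mutate filter's replace option accepts %{field} references, so it should be able to prefix the zero in place, without the temporary oldHelperDate field. Once every value is padded to eight digits, the date filter only needs the ddMMyyyy pattern.

    if [HelperDate] =~ /^[0-9]{7}$/ {
        mutate {
            # Prefix a zero in place; no temporary field is created
            replace => { "HelperDate" => "0%{HelperDate}" }
        }
    }
    date {
        # All values are now eight digits, so one pattern is enough
        match => [ "HelperDate", "ddMMyyyy" ]
    }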

Simulating this config for the values 6112019 and 29112019, I got the following output.

{
    "oldHelperDate" => "6112019",
       "@timestamp" => 2019-11-06T03:00:00.000Z,
         "@version" => "1",
       "HelperDate" => "06112019",
             "host" => "elk"
}
{
    "@timestamp" => 2019-11-29T03:00:00.000Z,
      "@version" => "1",
    "HelperDate" => "29112019",
          "host" => "elk"
}

Since your date field does not have a time, Logstash will assume 00:00:00.000 in the server's local time zone and convert that to UTC.

In my case I'm on UTC-3, so the time of my date will be 03:00:00.000Z.
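
If that offset is not what you want, the date filter has a timezone option that sets the zone used for parsing. A minimal sketch, assuming the dates in the CSV are meant to be UTC (use the real time zone ID of your data otherwise):

    date {
        match => [ "HelperDate", "ddMMyyyy" ]
        # Assumption: the source dates are UTC, so midnight stays 00:00:00.000Z
        timezone => "UTC"
    }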

See here. Even for "d" or "H" the parser will consume two digits if they are available, and there are no months with 61 days, so it fails to parse 6112019.

Thank you so much. Works like a charm! You made my day!