Logstash uses current year for timestamp without year and shows events with future dates

Hi Community !!

I have logs in which no year is mentioned, and while parsing, Logstash adds the current year, which makes the events appear under future dates.

I checked the forums for a similar issue, but unfortunately there wasn't a solution for it. Based on the discussion, the date filter should automatically infer the correct year, but that's not happening in my case. Below is a sample log timestamp.

Tue Oct 20 11:04:30.996 data: zone1 data: zone2 this is log date from Oct 2020.

And the output I get is:

              "new" => 2021-10-19T08:04:30.996Z,
          "logtime" => "Tue Oct 20 11:04:30.996",
             "path" => "/home/zone.log",
          "message" => "Tue Oct 20 11:04:30.996 data: zone1 data: zone2",

grok filter is

grok {
    match => { "message" => "(?<logtime>%{DAY} %{MONTH} %{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND}) ........" }
}

date {
    match => [ "logtime", "EEE MMM dd HH:mm:ss.SSS" ]
    target => "new"
}

There's no _dateparsefailure tag, but the timestamp ends up in the future. Please advise on this.

@theirfan Looks like this is a known issue with the date filter: Problem with inferred year in edge cases

One workaround would be to hard-code the year while processing older data.

Here's a test config:

input {
    generator  {
        message => 'Tue Oct 20 11:04:30.996 data: zone1 data: zone2'
        count => 2
    }
}

filter {

    grok {
        match => { "message" => "(?<logtime>%{DAY} %{MONTH} %{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND}) ........" }
    }

    mutate {
        replace => ["logtime", "%{logtime} 2020"]
    }

    date {
        match =>  [ "logtime", "EEE MMM dd HH:mm:ss.SSS YYYY"  ]
        target => "new"
    }
}

output {
    stdout {
        codec => rubydebug
    }
}

Results:

{
      "sequence" => 1,
       "logtime" => "Tue Oct 20 11:04:30.996 2020",
          "host" => "ivv-baseline-image",
    "@timestamp" => 2021-07-07T18:38:24.685Z,
      "@version" => "1",
       "message" => "Tue Oct 20 11:04:30.996 data: zone1 data: zone2",
           "new" => 2020-10-20T11:04:30.996Z
}
{
      "sequence" => 0,
       "logtime" => "Tue Oct 20 11:04:30.996 2020",
          "host" => "ivv-baseline-image",
    "@timestamp" => 2021-07-07T18:38:24.664Z,
      "@version" => "1",
       "message" => "Tue Oct 20 11:04:30.996 data: zone1 data: zone2",
           "new" => 2020-10-20T11:04:30.996Z
}

Let me know if this helps.

@ritchierich

Thanks for your reply. Your idea is good and it works. I gave it a try, but the issue with this approach is that I cannot select the correct datetime field for visualizations. With it applied, I have two time fields, "@timestamp" and "new", and when applying filters I can only use one date field as the reference for my data. Correct me if I'm getting this wrong, please!

I tried some Ruby code to drop any logtime older than the past 3 months, but I still get wrongly parsed dates. For example, February 2020 shows up again as February 2021, one day ahead.

I have a series of logs in this pattern. Please advise.

I was going off the example you shared. If you're satisfied with the workaround, you can target @timestamp like below:

    date {
        match =>  [ "logtime", "EEE MMM dd HH:mm:ss.SSS YYYY"  ]
        target => "@timestamp"
    }

You're going to need to add some logic around the month to identify 2020 vs. 2021, or review the data before ingesting it.
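For reference, one way that logic might be sketched (an untested assumption on my part, not something verified against your pipeline) is a ruby filter that rolls the parsed date back a year whenever the inferred date lands in the future:

```
# Sketch only: assumes the date filter above has already parsed "logtime"
# into the "new" field with the current year inferred.
ruby {
    code => '
        t = event.get("new")
        if t && t.time > Time.now
            tm = t.time
            # Rebuild the timestamp one calendar year earlier.
            # Note: Feb 29 would need extra handling in non-leap years.
            event.set("new", LogStash::Timestamp.new(
                Time.utc(tm.year - 1, tm.month, tm.day,
                         tm.hour, tm.min, tm.sec, tm.usec)))
        end
    '
}
```

This avoids hard-coding a year, at the cost of assuming no log line is genuinely future-dated.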

@ritchierich thanks for your answer. I believe the best solution would be to write some Ruby code.

With some help, I was able to drop any future-dated events with Ruby code, to avoid dates in the future.

The catch here is to also check for a date that doesn't exist in the inferred year!!
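That edge case (e.g. Feb 29 parsed against a non-leap year) can be sketched in plain Ruby, assuming the month and day have already been extracted from the log line; the helper name and signature below are hypothetical:

```ruby
require 'date'

# Hypothetical helper: pick the most recent year in which the given
# month/day actually exists and does not land in the future.
def infer_year(month, day, now = Time.now.utc)
  year = now.year
  # Step back a year until the date is valid (handles Feb 29 in
  # non-leap years) and is not in the future relative to "now".
  year -= 1 while !Date.valid_date?(year, month, day) ||
                  Time.utc(year, month, day) > now
  year
end
```

For example, with "now" at July 2021, Oct 20 would infer 2020 (since Oct 20, 2021 is in the future), while Feb 29 would also infer 2020 (since Feb 29, 2021 doesn't exist).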