Missing year set to 1970 by Logstash


  • Running ELK 7.10.1

Recently I started to process a new type of log that does not include the year, so timestamps come in the format "MMM dd HH:mm:ss".

According to the Logstash documentation (and this post), in cases like this Logstash should automatically add the current year (2021), but instead it's adding the default year 1970. This used to be an issue with older versions of Logstash, but I understand it was fixed, so I'm not sure why it's happening now.

Is there a configuration I can set or enable in Logstash so that it adds the current year when it's missing from the logs, instead of 1970?

Thank you in advance

Are you using the date filter? The below gives me the current year.

 date {
     match => [ "date", "MMM dd HH:mm:ss" ]
     target => "date-convert"
 }

which produces:

 "date" => "JAN 01 12:11:10",
 "date-convert" => 2021-01-01T19:11:10.000Z,

I am not. I was not aware that I had to use a filter for this. I thought it was something automatic and transparent for the user, like an internal config.

I'll give it a try.

Question: can I reuse the same time field instead of adding a new, modified one? Something like this:

 date {
     match => [ "Date Timestamp", "MMM dd HH:mm:ss" ]
     target => "Date Timestamp"
 }

Yes. Just remove the target line and it will overwrite the existing field. Target is just for when you want to create a new field with the new value.

I tried it, but only the time (not the date) was affected.

 date {
     match => [ "Date Timestamp", "MMM dd HH:mm:ss" ]
 }

The field changed from Feb 16 19:38:32 to Feb 18, @ 06:43:07.000, but:

  • The current year was not added
  • All entries were indexed to year 1970 again

In the mapping template that I'm using, the field is already declared as a date field. I'm not sure if we are trying to apply the date filter twice (template and input), or if it doesn't really matter:

      "Date Timestamp": {
         "type": "date",
         "format": "MMM dd HH:mm:ss"
      }

I think I would remove that format from your mapping and try again. Your date is no longer in that format by the time it reaches the mapping.
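For reference, a minimal sketch of what the adjusted mapping could look like, with the format parameter removed so Elasticsearch falls back to its default date parsing (the field name is taken from this thread; the rest of the template is assumed):

      "Date Timestamp": {
         "type": "date"
      }

With no explicit format, Elasticsearch applies the default strict_date_optional_time||epoch_millis parsers, which accept the ISO8601 value the date filter emits.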

Now it fails with an error on every entry:

"status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [Date Timestamp] of type [date] in document with id 'D-YU1XcB8LyTQ8P8ZL7J'. Preview of field's value: 'Feb 17 10:36:57'", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"failed to parse date field [Feb 17 10:36:57] with format [strict_date_optional_time||epoch_millis]", "caused_by"=>{"type"=>"date_time_parse_exception", "reason"=>"Failed to parse with all enclosed parsers"

Feb 17 10:36:57

Your data isn't leaving Logstash correctly. It should look like 2021-01-01T19:11:10.000Z after the date filter.

Can you post your entire Logstash config?

If you are sending Elasticsearch a string and relying on it to parse the date, then I can well believe Elasticsearch does not apply the same logic that Logstash does (defaulting the year to the current or previous year, depending on the current month and the month in the date string), and instead defaults it to 1970 (which is what Logstash itself used to do).

If you use a date filter to parse the string and overwrite it by setting the target option, you will get the default you want for the year. Logstash will then send the field to Elasticsearch in the format Aaron mentioned (2021-01-01T19:11:10.000Z), and if you do not have a mapping telling it otherwise, date detection in Elasticsearch will apply the [strict_date_optional_time||epoch_millis] parsers.
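To sketch the distinction described above (field name taken from this thread; the comments reflect my understanding of the date filter's behavior):

 # Without target, the parsed result goes to @timestamp and the
 # original string field is left untouched:
 date {
     match => [ "Date Timestamp", "MMM dd HH:mm:ss" ]
 }

 # With target, the parsed ISO8601 value (year inferred from the
 # current date) overwrites the original field, so Elasticsearch
 # receives e.g. 2021-01-01T19:11:10.000Z instead of a raw string:
 date {
     match  => [ "Date Timestamp", "MMM dd HH:mm:ss" ]
     target => "Date Timestamp"
 }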

I agree with you. Processing and results are not as expected. Here is my config:

input {
    file {
        id => "PF_Logs"
        path => "/opt/pf_logs/*.log"
        mode => "read"
        start_position => "beginning"
        file_completed_action => "delete"
    }
}

filter {
    csv {
        columns => [
                "Date Timestamp",
                "Tracker ID",
                "Interface Name",
                "IP Version",
                "Protocol ID",
                "SRC IP",
                "DST IP",
                "SRC Port",
                "DST Port",
                "IP evaluated",
                "Feed Name",
                "Resolved Hostname",
                "Client Hostname",
                "Duplicate event status"
        ]
        separator => ","
        skip_header => true
        skip_empty_rows => true
    }

    date {
        match => [ "Date Timestamp", "MMM dd HH:mm:ss" ]
    }

    if [SRC IP] {
      geoip {
        source => "SRC IP"
        target => "geoS"
        tag_on_failure => []
      }
    }

    if [DST IP] {
      geoip {
        source => "DST IP"
        target => "geoD"
        tag_on_failure => []
      }
    }
}

output {
     elasticsearch {
        hosts => [""]
        index => "pf_logs-%{+YYYY.MM}"
        user => "****************"
        password => "******************"
        ssl => true
        ssl_certificate_verification => true
        cacert => "**********"
    }
    stdout { codec => rubydebug }
}

@Badger, thank you for joining. I am a little confused after reading your words due to my lack of knowledge. Please correct me if I got it wrong: are you saying that I should use a target, as @aaron-nimocks suggested in the first place, instead of just removing it to overwrite the field I have?

Should I use this?:

 date {
     match => [ "Date Timestamp", "MMM dd HH:mm:ss" ]
     target => "Date Timestamp"
 }

and not this?:

 date {
     match => [ "Date Timestamp", "MMM dd HH:mm:ss" ]
 }

The second one will set the field [@timestamp]; if you want to overwrite the [Date Timestamp] field, then you must use the target option.


Got it! Thanks.

I was trying to leave the @timestamp field to record when the log files are processed in ELK, and the Date Timestamp field to record the moment of the event, just in case I need to build a filter on it in the future.

I will try to overwrite the Date Timestamp field using the target option.


That did the job! :yum: :+1:

Initially I was getting an error in the logs:

"status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [Date Timestamp] of type [date] in document with id 'J-5y1XcB8LyTQ8P8kixS'. Preview of field's value: '2021-02-18T16:43:07.000Z'", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"failed to parse date field [2021-02-18T16:43:07.000Z] with format [MMM dd HH:mm:ss]", "caused_by"=>{"type"=>"date_time_parse_exception", "reason"=>"Text '2021-02-18T16:43:07.000Z' could not be parsed at index 0"}

In the log above you can see Logstash is passing the date in the format '2021-02-18T16:43:07.000Z'. The error was due to the format defined in the mapping template within Elasticsearch. After removing that format and trying again, the error was gone. Now I have the following values for the time fields:

@timestamp Feb 24, 2021 @ 14:20:11.162

Date Timestamp Feb 18, @ 11:43:07.000

Even so, Elasticsearch indexed all entries in Feb 2021, as expected!


  • Even though the entries are being indexed with the right year, why is the Date Timestamp field not showing the year?
  • The main goal is to have events indexed properly, but is there a way to actually include/show the year in the Date Timestamp field?

That is a presentation issue. I do not run Kibana, but I believe you would need to change dateFormat in the settings.
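If it helps, the dateFormat entry lives under Kibana's Advanced Settings (Stack Management → Advanced Settings) and takes a moment.js pattern; a pattern including the year would be something like the following (the exact value shown is an illustration, not taken from this thread):

 dateFormat: MMM D, YYYY @ HH:mm:ss.SSS

A field rendered as "Feb 18, @ 11:43:07.000" suggests the current pattern is simply missing the YYYY token, since the underlying document was indexed with the correct year.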


I'll take a look at that section.

Thanks a lot @Badger and @aaron-nimocks for your help!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.