Recently I started to process a new type of logs that does not include the year, so the timestamps come in the format "MMM dd HH:mm:ss".
According to the Logstash documentation (and this post), in cases like this Logstash should automatically add the current year (2021), but instead it is adding the default year, 1970. This used to be an issue with older versions of Logstash, but I understand it was fixed, so I'm not sure why it is happening now.
Is there a configuration that I can set or enable in Logstash so it adds the current year when it is missing from the logs, instead of 1970?
I am not. I was not aware that I had to use a filter for this; I thought it was something automatic and transparent to the user, like an internal config.
I'll give it a try.
Question: can I reuse the same time field instead of adding a new, modified one? Something like this:
date {
  match => [ "Date Timestamp", "MMM dd HH:mm:ss" ]
  target => "Date Timestamp"
}
In the mapping template that I'm using, the field is already declared as a date field. I'm not sure if that means we would be trying to apply date parsing twice (template and input), or if it does not really matter:
"status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [Date Timestamp] of type [date] in document with id 'D-YU1XcB8LyTQ8P8ZL7J'. Preview of field's value: 'Feb 17 10:36:57'", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"failed to parse date field [Feb 17 10:36:57] with format [strict_date_optional_time||epoch_millis]", "caused_by"=>{"type"=>"date_time_parse_exception", "reason"=>"Failed to parse with all enclosed parsers"
If you are sending elasticsearch a string and relying on it to parse the date, then I can well believe that elasticsearch does not apply the same logic that logstash does (defaulting the year to the current or previous year depending on the current month and the month in the date string), and instead defaults it to 1970 (which is what logstash used to do).
If you use a date filter to parse the string and overwrite it by setting the target option, you will get the default you want for the year. logstash will send the field to elasticsearch in the format Aaron mentioned (2021-01-01T19:11:10.000Z), and if you do not have a mapping telling it otherwise, date detection in elasticsearch will apply the [strict_date_optional_time||epoch_millis] parsers.
@Badger, thank you for joining. I am a little confused after reading your reply, due to my lack of knowledge; please correct me if I got it wrong. Are you saying that I should use a target and create a new field, as @aaron-nimocks suggested in the first place, instead of overwriting the same field I have?
Should I use this?:
date {
  match => [ "Date Timestamp", "MMM dd HH:mm:ss" ]
  target => "Date Timestamp"
}
and not this?:
date {
  match => [ "Date Timestamp", "MMM dd HH:mm:ss" ]
}
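If I understand the date filter correctly, the difference is where the parsed date ends up (a sketch with my assumptions in the comments; the field name is from my logs):

```
# Without target, the date filter writes the parsed date to @timestamp,
# overwriting the ingest time.
date {
  match => [ "Date Timestamp", "MMM dd HH:mm:ss" ]
}

# With target, the parsed date overwrites "Date Timestamp" instead,
# and @timestamp keeps the time the event was processed.
date {
  match => [ "Date Timestamp", "MMM dd HH:mm:ss" ]
  target => "Date Timestamp"
}
```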
I was trying to leave the field @timestamp to register the time when the log files are processed in ELK, and the field Date Timestamp to register the moment of the event, just in case I need to build a filter on it in the future.
I will try to overwrite the field Date Timestamp using the target option.
"status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [Date Timestamp] of type [date] in document with id 'J-5y1XcB8LyTQ8P8kixS'. Preview of field's value: '2021-02-18T16:43:07.000Z'", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"failed to parse date field [2021-02-18T16:43:07.000Z] with format [MMM dd HH:mm:ss]", "caused_by"=>{"type"=>"date_time_parse_exception", "reason"=>"Text '2021-02-18T16:43:07.000Z' could not be parsed at index 0"}
In the above log you can see Logstash is passing the date in the format '2021-02-18T16:43:07.000Z'. The error came from the format defined in the mapping template within Elasticsearch. After removing that format and trying again, the error was gone. Now I have the following values for the time fields:
@timestamp Feb 24, 2021 @ 14:20:11.162
Date Timestamp Feb 18, @ 11:43:07.000
Even so, Elasticsearch indexed all entries in Feb, 2021 as expected!
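For reference, this is roughly the relevant part of my mapping template before and after the change (a sketch; the rest of the template is omitted, and the "format" value is taken from the earlier error message):

```json
{
  "mappings": {
    "properties": {
      "Date Timestamp": {
        "type": "date"
      }
    }
  }
}
```

Before, the field also had `"format": "MMM dd HH:mm:ss"`, which is why Elasticsearch rejected the ISO8601 string coming from the date filter. Without an explicit format, it falls back to the default [strict_date_optional_time||epoch_millis] parsers.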
Questions:
Even though the entry is being indexed with the right year, why is the field Date Timestamp not showing the year?
The main goal is to have events indexed properly, but is there a way to actually include/show the year in the field Date Timestamp?