Hello,
Currently in my pipeline I'm parsing out log lines. Each log line begins with a timestamp such as "2020-01-19T00:00:08.127+0800", which I parse into a field called logtimestamp and then, later in the pipeline, use to set the @timestamp field.
For example:
date {
  match => ["logtimestamp", "ISO8601"]
  remove_field => ["logtimestamp"]
}
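For context, logtimestamp comes out of a dissect earlier in the pipeline. The real mapping has a lot more fields, but the relevant part is roughly this (the second field name here is just a placeholder):
dissect {
  mapping => {
    "message" => "%{logtimestamp} %{restOfLine}"
  }
}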
The date filter works fine, but it feels very slow. Looking at the stats for the pipeline, it averages 0.05-0.06 ms/event. For comparison, the dissect of my log line takes about 0.03 ms/event on average.
I've started looking at the code in timestamp.rb, and it looks like it's taking this timestamp, converting it to milliseconds, and creating a LogStash::Timestamp object.
If I run something like this instead, it appears to drop down to 0.03-0.04 ms/event:
ruby {
  code => 'event.set("@timestamp", LogStash::Timestamp.parse_iso8601(event.get("logtimestamp")))'
}
A second note: in my case the log files come from different timezones, but I want them all stored as UTC, so the timezone data is always dropped. For instance:
event.set("@timestamp", LogStash::Timestamp.parse_iso8601(event.get("logtimestamp")[0..22]))
event.remove("logtimestamp")
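Put together, that ends up as a single ruby filter, roughly like this (the nil guard and comments are just my additions to be safe on lines that didn't dissect cleanly):
ruby {
  code => '
    ts = event.get("logtimestamp")
    if ts
      # keep the first 23 characters ("2020-01-19T00:00:08.127"), dropping the +0800 offset
      event.set("@timestamp", LogStash::Timestamp.parse_iso8601(ts[0..22]))
      event.remove("logtimestamp")
    end
  '
}
If parse_iso8601 raises on a bad value, the ruby filter's tag_on_exception (default "_rubyexception") should flag the event.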