Optimizing date field processing by moving from date {} to Ruby code


Currently my pipeline parses log lines, each of which begins with a timestamp such as "2020-01-19T00:00:08.127+0800". I extract this into a field called logTimestamp, and later in the pipeline copy it into the @timestamp field.

For example:

date {
  match => ["logTimestamp", "ISO8601"]
  remove_field => ["logTimestamp"]
}

This works fine, but feels very slow. Looking at the stats for the pipeline, the date filter averages between 0.05 and 0.06 ms/event. For comparison, the dissect of my log line takes on average 0.03 ms/event.

I've started looking at the code in timestamp.rb, and it looks like it's taking this timestamp, converting it to milliseconds, and creating a Logstash timestamp object.
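As a rough plain-Ruby illustration of that parse-to-milliseconds step (this is not the actual code in timestamp.rb, just a sketch using the standard library's Time.strptime):

```ruby
require 'time'

# Parse the ISO8601 string, including its "+0800" offset,
# then convert the instant to epoch milliseconds.
ts = Time.strptime("2020-01-19T00:00:08.127+0800", "%Y-%m-%dT%H:%M:%S.%L%z")
millis = ts.to_i * 1000 + ts.nsec / 1_000_000
puts millis  # => 1579363208127 (the same instant, expressed in UTC millis)
```

Note that the offset is honoured here, so the resulting instant is the true UTC time of the event.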

If I run something like this it appears to drop me down to 0.03-0.04 ms/event:

ruby { code => ' event.set("@timestamp", LogStash::Timestamp.parse_iso8601(event.get("logTimestamp"))) ' }

A second note: in my case the log files come from different time zones, but I want them all saved as UTC, so the timezone offset is simply dropped. For instance:

event.set("@timestamp", LogStash::Timestamp.parse_iso8601(event.get("logTimestamp")[0..22]))
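To show what that slice is doing, here is a plain-Ruby sketch (using stdlib DateTime rather than LogStash::Timestamp, so it runs outside Logstash). One caveat: [0..22] keeps exactly 23 characters, so it assumes the fractional part is always exactly three digits.

```ruby
require 'date'

raw = "2020-01-19T00:00:08.127+0800"
# [0..22] keeps "2020-01-19T00:00:08.127" and discards the "+0800" offset.
local_part = raw[0..22]
# With no zone in the pattern, DateTime.strptime defaults to +00:00,
# i.e. the local wall-clock time is reinterpreted as UTC.
dt = DateTime.strptime(local_part, "%Y-%m-%dT%H:%M:%S.%L")
puts dt.strftime("%Y-%m-%dT%H:%M:%S.%LZ")  # => 2020-01-19T00:00:08.127Z
```

Be aware this stores the local wall-clock time as if it were UTC, which shifts the actual instant by the original offset; that is fine only if that is genuinely the intent.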
