How to parse date field into @timestamp

I'm trying to replace the @timestamp that's generated by logstash with the contents of an existing field in my data. I've been fighting with this all day and I'm getting nowhere.

My data looks like this:

{
"start_time" : "2017-11-09T21:15:51.906Z"
.
.
.
}

What I need to do is grab the value in start_time and put it into the @timestamp field. So I started with the simple stuff: using mutate to replace the value in @timestamp with start_time. Of course, that fails because @timestamp is a Timestamp and start_time is a string...
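That first attempt looked roughly like this (reconstructed from memory, so the exact syntax may have differed):

    filter {
      mutate {
        # fails: @timestamp must be a Timestamp, but the sprintf reference produces a string
        replace => { "@timestamp" => "%{start_time}" }
      }
    }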

Digging further, I read a bunch of stuff on S/O and elsewhere that said I should be able to use the date filter to parse the date in my string and assign that to @timestamp.

So far nothing that I've tried using the date parsing stuff has worked.

I tried this:

date {
    match => ["start_time", "yyyy-MM-ddTHH:mm:ss.SSSZ", "ISO8601"]
    target => "@timestamp"
}

This gives me an error before logstash even gets running:

Cannot load an invalid configuration {:reason=>"Illegal pattern component: T"}

So then I tried:

date {
    match => ["start_time", "%{TIMESTAMP_ISO8601}"]
    target => "@timestamp"
}

I get the same response from this.

Finally, I tried:

date {
    match => ["start_time", "ISO8601"]
    target => "@timestamp"
}

This one at least allows logstash to start, but when an event is delivered, I get a resulting payload containing

"tags":["_dateparsefailure"]

No other stack trace or anything, just that...

Can someone point me in the right direction?

Edit: After more searching I tried the following (allowing for the T char in my date string):

date {
    match => ["start_time", "yyyy-MM-dd'T'HH:mm:ss.SSSZ", "ISO8601"]
    target => "@timestamp"
}

Unfortunately, while this starts up ok, I still get the _dateparsefailure tag.

You also need to quote the millisecond dot, like this:

date {
    match => ["start_time", "yyyy-MM-dd'T'HH:mm:ss'.'SSSZ"]
    target => "@timestamp"
}

Also, you do not need the second "ISO8601" pattern if you only have a specific datetime pattern.
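For what it's worth, the match setting takes the field name followed by any number of patterns and tries them in order, so extra patterns only matter if the field can arrive in more than one shape. A sketch (the second pattern here is purely illustrative):

    date {
        match => ["start_time", "yyyy-MM-dd'T'HH:mm:ss'.'SSS'Z'", "ISO8601"]
        target => "@timestamp"
    }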

That doesn't seem to change anything, unfortunately. I still get the _dateparsefailure tag and the date is not properly set into @timestamp.

Just using ISO8601 doesn't work? Z is also a pattern character in these strings so you need 'Z'.
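In other words, either of these (untested against your exact data):

    date {
        match => ["start_time", "ISO8601"]
        target => "@timestamp"
    }

or, with every literal character quoted, including the trailing Z:

    date {
        match => ["start_time", "yyyy-MM-dd'T'HH:mm:ss'.'SSS'Z'"]
        target => "@timestamp"
    }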

Using ISO8601 gives the same result. As does using my original string with the Z in single quotes.

That's surprising. What's in the Logstash log? The date filter logs any failures.

I was expecting that the logs would show something, but they don't. There's no error or stack trace or anything from the date filter. The log contents of one complete run of my test are as follows:

[2017-11-13T10:07:44,154][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2017-11-13T10:07:44,183][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://localhost:9200/, :path=>"/"}
[2017-11-13T10:07:44,350][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>#<URI::HTTP:0x6919c553 URL:http://localhost:9200/>}
[2017-11-13T10:07:44,351][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2017-11-13T10:07:44,417][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>50001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"_all"=>{"enabled"=>true, "norms"=>false}, "dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword"}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date", "include_in_all"=>false}, "@version"=>{"type"=>"keyword", "include_in_all"=>false}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2017-11-13T10:07:44,431][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>[#<URI::HTTP:0x6ae35e90 URL:http://localhost:9200>]}
[2017-11-13T10:07:44,434][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>250}
[2017-11-13T10:07:45,399][INFO ][logstash.pipeline        ] Pipeline main started
[2017-11-13T10:07:45,509][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2017-11-13T10:08:00,346][INFO ][logstash.inputs.jdbc     ] (0.041000s) SELECT * from DS_AUDIT where START_TIME > TIMESTAMP '1970-01-01 00:00:00.000000 +00:00'
[2017-11-13T10:08:18,699][WARN ][logstash.runner          ] SIGINT received. Shutting down the agent.
[2017-11-13T10:08:18,715][WARN ][logstash.agent           ] stopping pipeline {:id=>"main"}

Strangely, the log doesn't show the output from the one record that gets extracted from the database, although it does show up on stdout.

Okay. Please show what you get on stdout and your configuration so that we can attempt to reproduce.
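If it helps, a rubydebug stdout output prints each event in a readable, pretty-printed form:

    output {
        stdout { codec => rubydebug }
    }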

The config looks like this:

input {
    jdbc {
        jdbc_connection_string => "jdbc:oracle:thin:@192.168.56.101:1521/MRO"
        jdbc_user => "user"
        jdbc_password => "password"
        jdbc_driver_library => "/vagrant/chef-repo/cookbooks/logstash/files/ojdbc14-10.2.0.1.0.jar" 
        jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
        statement => "SELECT * from DS_AUDIT where START_TIME > :sql_last_value"
        schedule => "* * * * *"
        clean_run => "true"
        tracking_column => "START_TIME"
        tracking_column_type => "timestamp"
    }
}
filter {
  date {
    match => ["start_time", "yyyy-MM-dd'T'HH:mm:ss'.'SSS'Z'"]
    target => "@timestamp"
  }
}
output {
    elasticsearch {
        index => "sites"
        document_type => "site"
        hosts => "http://localhost:9200"
    }
}
output {
    stdout { codec => json_lines }
}

When I run my test, I get the following on stdout:

[2017-11-13T12:15:47,594][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2017-11-13T12:15:47,609][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://localhost:9200/, :path=>"/"}
[2017-11-13T12:15:47,843][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>#<URI::HTTP:0x6090de40 URL:http://localhost:9200/>}
[2017-11-13T12:15:47,845][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2017-11-13T12:15:47,924][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>50001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"_all"=>{"enabled"=>true, "norms"=>false}, "dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword"}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date", "include_in_all"=>false}, "@version"=>{"type"=>"keyword", "include_in_all"=>false}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2017-11-13T12:15:47,935][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>[#<URI::HTTP:0x70b3004c URL:http://localhost:9200>]}
[2017-11-13T12:15:47,942][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>250}
[2017-11-13T12:15:49,436][INFO ][logstash.pipeline        ] Pipeline main started
[2017-11-13T12:15:49,569][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2017-11-13T12:16:00,579][INFO ][logstash.inputs.jdbc     ] (0.107000s) SELECT * from DS_AUDIT where START_TIME > TIMESTAMP '1970-01-01 00:00:00.000000 +00:00'
{"user_name":null,"end_time":null,"audit_status":null,"tags":["_dateparsefailure"],"start_time":"2017-11-09T21:15:51.906Z","site":null,"@timestamp":"2017-11-13T17:16:00.676Z","application":"Interactive","method_name":null,"@version":"1","details":null,"method_class":null,"status":null}
^C[2017-11-13T12:16:30,048][WARN ][logstash.runner          ] SIGINT received. Shutting down the agent.
[2017-11-13T12:16:30,060][WARN ][logstash.agent           ] stopping pipeline {:id=>"main"}

The line that I get after the JDBC SELECT statement on stdout does not show up in the log file. I'm assuming that's simply because I have a stdout output stage in my pipeline.

An interesting development. I tried the following:

filter {
  mutate {
    convert => { "start_time" => "string" }
  }
  date {
    match => ["start_time", "yyyy-MM-dd'T'HH:mm:ss'.'SSS'Z'"]
    timezone => "UCT"
    target => "@timestamp"
  }
}

And it works. My assumption here is that since start_time is coming from a JDBC call, its type is not actually a String, so the date filter's matcher fails in a weird fashion. Converting it to a String first seems to do the trick.


My assumption here is that since start_time is coming from a JDBC call, its type is not actually a String, so the date filter's matcher fails in a weird fashion. Converting it to a String first seems to do the trick.

Yes, that's exactly right. I filed this issue a while back:

So would the recommended approach be to use the mutate filter to create a new string field and set @timestamp from that? Or would it be better to just do it in a Ruby filter instead?
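Roughly like this, I imagine (the start_time_str field name is just for illustration):

    filter {
      mutate {
        # sprintf the timestamp into a temporary string field
        add_field => { "start_time_str" => "%{start_time}" }
      }
      date {
        match => ["start_time_str", "ISO8601"]
        target => "@timestamp"
        # only applied when the match succeeds
        remove_field => ["start_time_str"]
      }
    }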

Why not just rename the field since it already is a timestamp?
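Something like this, that is (untested, and assuming start_time really is a timestamp object by the time the filter runs):

    filter {
      mutate {
        rename => { "start_time" => "@timestamp" }
      }
    }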

Unfortunately, I don't control the schema for Elasticsearch. Other users of the system rely on the start_time field as well, so it needs to remain.

So do you need to do anything at all then? Wasn't your end goal to have start_time as a date field?

No, the end goal was to have @timestamp have the same value as start_time. I think I have a solution that will work based on what we've discussed here. Thanks for all of your help...

Okay. How about copying start_time to @timestamp with the mutate filter's copy option?
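That is, something along these lines (it needs a mutate plugin version that actually has the copy option):

    filter {
      mutate {
        copy => { "start_time" => "@timestamp" }
      }
    }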

The version of logstash that is currently approved for use here (v5.3.1) doesn't seem to have a copy option. I tried replace originally, but that fails with a type mismatch.
