Hi there. I have a .csv file in which my @timestamp field is formatted like this:
0043838407D89D6773491A20160918215104.8191+020000
How should I configure Logstash to make it ignore the first part and correctly represent the timestamp?
My configuration file is the following:
input {
  file {
    path => "path/to/my/file.csv"
    type => "test"
    start_position => "beginning"
  }
}
filter {
  csv {
    columns => [
      "@timestamp",
      # ... a lot of other columns ...
    ]
    separator => ","
  }
  date {
    match => [ "@timestamp", "???" ]
  }
}
output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
  }
  stdout { codec => rubydebug }
}
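(A note for anyone testing this repeatedly: the file input tracks how far it has read in a sincedb, so start_position => "beginning" only takes effect the first time a file is seen; setting sincedb_path => "/dev/null" is a common way to force a full re-read on each run.)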
Thanks a lot
warkolm (Mark Walkom) October 1, 2016, 10:19am
Where is the actual date in that?
20160918215104.8191+020000
You see, it looks like YYYYMMddHHmmss.SSSSZ, but I'm not sure it would work, and I don't know how to ignore the initial text.
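If I split the sample up that way (assuming the last two zeros aren't part of the offset):

0043838407D89D6773491A   <- leading data to skip
20160918215104           <- date and time (YYYYMMddHHmmss)
.8191                    <- fractional seconds (.SSSS)
+0200                    <- UTC offset (Z)
00                       <- two trailing digits that don't seem to belong to the timestamp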
Thanks
Use a grok filter to extract the pieces you're interested in.
You mean using grok instead of csv? Or something else?
Thanks
By all means keep using csv, but use a grok filter to extract the interesting parts of the first field. Something like
grok {
  match => {
    "@timestamp" => ".*(?<@timestamp>\d{14}\.\d+[+-]\d{4})\d\d$"
  }
}
would work. If the number of characters that comes before the timestamp is fixed, it's possible to write a more exact pattern that's faster.
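For example, if the leading part is always 22 characters, as in your sample value, an anchored pattern along these lines avoids the leading .* and the backtracking it causes (the capture name here is just illustrative):

grok {
  match => {
    "@timestamp" => "^.{22}(?<ts>\d{14}\.\d+[+-]\d{4})\d\d$"
  }
}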
Thanks for the kind reply; I'm starting to understand some nice features.
I have removed the date filter and added your grok filter, loading the following configuration file:
input {
  file {
    path => "path/to/my/file.csv"
    type => "test"
    start_position => "beginning"
  }
}
filter {
  csv {
    columns => [
      "@timestamp",
      # ... a lot of other columns ...
    ]
    separator => ","
  }
  grok {
    match => {
      "@timestamp" => ".*(?<@timestamp>\d+\.\d{4}[+-]\d{4})\d\d$"
    }
  }
}
output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
  }
  stdout { codec => rubydebug }
}
But unfortunately, Logstash is now returning this error:
Settings: Default pipeline workers: 2
Pipeline aborted due to error {:exception=>"RegexpError", :backtrace=>["org/jruby/RubyRegexp.java:1434:in `initialize'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.3/lib/grok-pure.rb:127:in `compile'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-2.0.5/lib/logstash/filters/grok.rb:264:in `register'", "org/jruby/RubyArray.java:1613:in `each'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-2.0.5/lib/logstash/filters/grok.rb:259:in `register'", "org/jruby/RubyHash.java:1342:in `each'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-2.0.5/lib/logstash/filters/grok.rb:255:in `register'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/pipeline.rb:182:in `start_workers'", "org/jruby/RubyArray.java:1613:in `each'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/pipeline.rb:182:in `start_workers'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/pipeline.rb:136:in `run'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/agent.rb:491:in `start_pipeline'"], :level=>:error}
stopping pipeline {:id=>"main"}
I can't understand what's going on.
You still need the date filter, and when the @timestamp field contains something it can parse it'll actually work. But it seems the regexp library doesn't want to capture into a field with @ in it, so let's call it plain timestamp instead. This works:
filter {
  grok {
    match => {
      "@timestamp" => ".*(?<timestamp>\d{14}\.\d{4}[+-]\d{4})\d\d$"
    }
  }
  date {
    match => ["timestamp", "YYYYMMddHHmmss.SSSSZ"]
    remove_field => ["timestamp"]
  }
}
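For reference, the date filter stores the parsed result in @timestamp by default, which is why the original field simply gets overwritten here; the optional target option makes that explicit:

date {
  match => ["timestamp", "YYYYMMddHHmmss.SSSSZ"]
  target => "@timestamp"   # the default; spelled out for clarity
  remove_field => ["timestamp"]
}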
Yes, now the pipeline is working, but there is still something wrong with the date:
Failed parsing date from field {:field=>"timestamp", :value=>"0043838407D89D6773491A20160918215104.8191+020000", :exception=>"Invalid format: \"0043838407D89D6773491A20160918215104.8191+...\" is malformed at \"D89D6773491A20160918215104.8191+...\"", :config_parsers=>"YYYYMMddHHmmss.SSSSZ", :config_locale=>"default=en", :level=>:warn}
It works for me, so you're doing something differently:
$ cat test.config
input { stdin {} }
output { stdout { codec => rubydebug } }
filter {
  grok {
    match => {
      "message" => ".*(?<timestamp>\d{14}\.\d{4}[+-]\d{4})\d\d$"
    }
  }
  date {
    match => ["timestamp", "YYYYMMddHHmmss.SSSSZ"]
    remove_field => ["timestamp"]
  }
}
$ echo '0043838407D89D6773491A20160918215104.8191+020000' | logstash -f test.config
Settings: Default pipeline workers: 8
Pipeline main started
{
       "message" => "0043838407D89D6773491A20160918215104.8191+020000",
      "@version" => "1",
    "@timestamp" => "2016-09-18T19:51:04.819Z",
          "host" => "bertie"
}
Pipeline main has been shutdown
stopping pipeline {:id=>"main"}
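Note, by the way, that the resulting @timestamp is stored in UTC, which is why 21:51:04 at a +02:00 offset shows up as 2016-09-18T19:51:04.819Z.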
Got it! I forgot to change the field name inside the grok filter.
Thanks for your kind support; this topic greatly improved my (limited) knowledge.