@timestamp not overwritten by date filter

I'm trying to backfill catalina.out from Tomcat, so I need to overwrite @timestamp with the timestamp from the logs.
Example message is:

[INFO] 2016-04-22 17:50:20,769 [Blah1] [Blah2] ...

My filter is:

filter {
  if [type] == "catalina" {
    grok {
      match => [ "message", "%{(\d){4}-(\d){2}-(\d){2} (\d){2}:(\d){2}:(\d){2},(\d){3}:sourcestamp}" ]
    }
    date {
      match => [ "sourcestamp" , "yyyy-MM-dd HH:mm:ss,SSS" ]
      timezone => "UTC"
    }
  }
}

I'm not getting any error messages in the Logstash logs, but the timestamps on the entries still don't match the timestamps from the logs. Any ideas?

You need to add

  target => "@timestamp"

to your date filter.

No, that's never necessary. It's the default.
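
In other words, a date filter as simple as the sketch below already writes the parsed time into @timestamp; setting target explicitly is redundant:

  date {
    match => [ "sourcestamp", "yyyy-MM-dd HH:mm:ss,SSS" ]
    # no target needed; the date filter writes to @timestamp by default
  }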

What does a processed message look like? Show the result of a stdout { codec => rubydebug } output.
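
If you don't already have one, an output section along these lines (a minimal sketch) will print every processed event to Logstash's stdout:

output {
  stdout { codec => rubydebug }
}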

I'm seeing grokparsefailure in the rubydebug log:

{
    "@timestamp" => "2016-04-25T17:37:56.048Z",
       "message" => "[INFO] 2016-04-25 17:37:17,528 [Blah1] [Blah2]",
      "@version" => "1",
          "beat" => {
        "hostname" => "host.name.com",
            "name" => "host.name.com"
    },
         "count" => 1,
        "fields" => nil,
    "input_type" => "log",
        "offset" => 331994238,
        "source" => "/usr/share/tomcat/logs/catalina.out",
          "type" => "catalina",
          "tags" => [
        [0] "_grokparsefailure"
    ]
}

I verified the regex with an online regex tester, so I'm not sure how to continue debugging this. How else can I figure out what's causing the grok parse failure?

Only now am I reading your previously posted config properly:

  match => [ "message", "%{(\d){4}-(\d){2}-(\d){2} (\d){2}:(\d){2}:(\d){2},(\d){3}:sourcestamp}" ]

I've never seen this syntax used before and I can only assume it doesn't work. The %{X:Y} notation doesn't allow X to be a regular expression, only the name of a grok pattern. Either use a named grok pattern (TIMESTAMP_ISO8601 probably works in this case) or skip the grok syntax altogether:

  match => [ "message", "(?<sourcestamp>(\d){4}-(\d){2}-(\d){2} (\d){2}:(\d){2}:(\d){2},(\d){3})" ]
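
With the named pattern the grok filter would presumably look something like this (untested sketch):

  grok {
    match => [ "message", "%{TIMESTAMP_ISO8601:sourcestamp}" ]
  }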

I tried a named pattern first; unfortunately it fails with the following error:
{:timestamp=>"2016-04-25T13:59:30.262000-0700", :message=>"Failed parsing date from field", :field=>"sourcestamp", :value=>"2016-04-25T00:00:00Z", :exception=>"Invalid format: \"2016-04-25T00:00:00Z\" is malformed at \"T00:00:00Z\"", :config_parsers=>"yyyy-MM-dd HH:mm:ss,SSS", :config_locale=>"default=en_US", :level=>:warn}

Searching around suggests it has to do with the Java date library being very strict about the format.

I also tried switching to the pattern you suggested, but I'm still getting grok parse failures:

{
     "@timestamp" => "2016-04-27T15:23:15.340Z",
        "message" => "2016-04-27 15:23:15,340 [INFO] elasticsearch:63 - GET http://host.name.com/path]",
       "@version" => "1",
           "file" => "/var/log/mongo-connector/mongo-connector.log",
           "host" => "host.name.com",
         "offset" => "34161681",
           "type" => "catalina",
    "sourcestamp" => "2016-04-27 15:23:15,340",
           "tags" => [
        [0] "_grokparsefailure"
    ]
}

The example message above does have the stamp before the [INFO] tag, but I'm assuming the pattern should still be able to match it.

Please show your filter configuration.

I'll show the entire config file, just in case the multiline codec that combines lines on their timestamps is affecting the filter.

input {
  beats {
    port => 5044
    codec => multiline {
        pattern => "(\d){4}-(\d){2}-(\d){2} (\d){2}:(\d){2}:(\d){2},(\d){3}"
        negate => true
        what => "previous"
    }
  }
}

filter {
  if [type] == "catalina" {
    grok {
      match => [ "message", "(?<sourcestamp>(\d){4}-(\d){2}-(\d){2} (\d){2}:(\d){2}:(\d){2},(\d){3})" ]
    }
    date {
      match => [ "sourcestamp" , "yyyy-MM-dd HH:mm:ss,SSS" ]
      timezone => "UTC"
    }
  }
}

Okay. That output is suspicious in itself: the sourcestamp field has obviously been extracted, yet the event is tagged _grokparsefailure, and a grok filter either extracts all of its fields or none of them. Secondly, the grok and date filters above work just fine:

$ cat test.config
input { stdin { type => "catalina" } }
output { stdout { codec => rubydebug } }
filter {
  if [type] == "catalina" {
    grok {
      match => [ "message", "(?<sourcestamp>(\d){4}-(\d){2}-(\d){2} (\d){2}:(\d){2}:(\d){2},(\d){3})" ]
    }
    date {
      match => [ "sourcestamp" , "yyyy-MM-dd HH:mm:ss,SSS" ]
      timezone => "UTC"
    }
  }
}
$ echo '2016-04-27 15:23:15,340 [INFO] elasticsearch:63 - GET http://host.name.com/path]' | /opt/logstash/bin/logstash -f test.config
Settings: Default pipeline workers: 2
Pipeline main started
{
        "message" => "2016-04-27 15:23:15,340 [INFO] elasticsearch:63 - GET http://host.name.com/path]",
       "@version" => "1",
     "@timestamp" => "2016-04-27T15:23:15.340Z",
           "type" => "catalina",
           "host" => "hallonet",
    "sourcestamp" => "2016-04-27 15:23:15,340"
}
Pipeline main has been shutdown
stopping pipeline {:id=>"main"}

I suspect there's another grok filter lurking in your configuration. Perhaps another file in /etc/logstash/conf.d that you've forgotten about?
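
Assuming the standard directory layout, grepping the whole config directory should reveal any additional grok filters Logstash is loading:

$ grep -rn "grok" /etc/logstash/conf.d/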

I removed every other config file to test your theory, and I think something else is also going on. The timestamps in logstash.stdout now match those from the logs, but the ones in Kibana still do not.

Is it possible the timestamps lose sync again at the Redis layer? I'm not using any filters there. Output to Redis is below:

output {
  redis {
    data_type => "list"
    key => "logstash:cache:dev"
    host => [ "redis.hostname.com" ]
  }
}

The Redis queue is then picked up by another server and pushed into Elasticsearch with the following config files:
In:

input {
  redis {
    host => "redis.hostname.com"
    data_type => "list"
    key => "logstash:cache:dev"
  }
}

Out:

output {
  elasticsearch {
    protocol => "http"
    host => [ "eshost.hostname.com" ]
  }
}

Do I need to do anything extra to preserve @timestamp when using Redis between Logstash and Elasticsearch?

I'm sorry, this was a case of massive PEBCAK. I had some of the Filebeat instances configured to output directly to Elasticsearch, so those logs were showing up in Kibana without any processing.
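
For anyone hitting the same thing: the offending Filebeat configuration looked roughly like the sketch below (paraphrased; hostnames and ports are placeholders). With the elasticsearch output enabled, events bypass Logstash entirely, so no filters ever run on them; they should have gone through the logstash output instead.

output:
  # wrong for this setup: ships straight to Elasticsearch, skipping Logstash filters
  elasticsearch:
    hosts: ["eshost.hostname.com:9200"]
  # intended: ship to the Logstash beats input on port 5044
  #logstash:
  #  hosts: ["logstash.hostname.com:5044"]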