Silent error on date parsing

Logstash is silently dropping my logs after implementing a small modification in my grok filter.

The original filter looks like:

filter {
  if [internal][logtype] == "mycustomtype" {
    grok {
      match => {
        "message" => [
          '^\[%{DATA:dummy}\]  %{LOGLEVEL:my.loglevel} \{%{DATA:my.source}\} - timestamp = %{DATA:timestamp}, status = %{DATA:my.status}, origin = %{DATA:my.origin}, extra_fields$'
    mutate {
      remove_field => [ "dummy" ]
    date {
        match => [ "timestamp" , "YYYY-MM-dd'T'HH:mm:ss.SSS" ] #2022-04-26T08:27:33.279
        target => "@timestamp"
        remove_field => [ "timestamp" ]
        timezone => "Europe/Madrid"

The modification consists on swapping the order of the status and origin fields.
The new filter has been thoroughly tested on the debugger, but is not working "on the field".

After some debugging, I could find that:

  • The original filter does not drop the log, even if the match fails
  • Any slight modification of the filter results in a drop
  • I can get a matched log by using a filter for the fields up to the timestamp, none after, like ^\[%{DATA:dummy}\] %{LOGLEVEL:my.loglevel} \{%{DATA:my.source}\} - timestamp = %{DATA:timestamp}
  • I can avoid the drop by commenting out the timezone in the date section, and get logs with and without grok match

Could anyone provide a hint on this weird behavior?

In particular:

  • Why the log is being dropped and the error silenced
  • Why the original filter is working?


Can you share your entire pipeline and a sample message?

Which filter you changed? The grok? the mutate? the Date filter?

If commenting the timezone option in the date filter fix your issue, this could mean that your logs are not being dropped, but being ingested with a wrong date and you may not be using a wide enough date range in the time filter to see it.

Can you add an extra output to write your logs to a file and see if they show up there?

Thanks @leandrojmp, this was the problem.
Well, this and my misunderstanding on how the timezone is working.

So, examining the different timestamps I now see that:

  • default @timestamp is shown in my local time (UTC+1), e.g. Nov 30, 2022 @ 21:22:57.480
  • the timestamp I'm parsing is already in UTC, so 2022-11-30T20:22:57.472
  • when adding the timezone I'm creating a new field which does the opposite of what I though. Instead of showing me the parsed message in my timezone, it shows the parsed ts in UTC assuming the original message was in my local time, so 2022-11-30T19:22:57.472Z

I was watching for the latest logs, and not realizing they were appearing with a ts two hours earlier. This certainly breaks causality! :smiley:

Thanks for your help!

P.S. It is still a mystery to me how the original filter was producing correct timestamps, but at this point is something I can live with.

Timezones can be confusing sometimes.

If your timestamp is already in UTC, you should not use the timezone option in the date filter, this option is to inform the date filter that the date string that it will parse is from a different timezone from UTC.

If the date string already has a timezone offset, you also should not use the timezone option.

Elasticsearch only store dates in UTC and Kibana always convert back from UTC to your local timezone.

In Kibana you can check it looking at the json of the document, in the table view you will have the date converted to your local timezone, but in the json view you will have the value without convertion, which will be in UTC.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.