Ingest Pipeline date processor adds hours

Hey there,

I have a problem with the date processor in an ingest pipeline for one logfile. It adds the timezone offset to my @timestamp, in my case +1h. I use pipelines with the date processor a lot, so I cannot figure out why this is happening with just this logfile.
I use filebeat to collect my logs, then ship them to logstash and from there to elasticsearch.

I am trying to ingest the following log line:
I2020-01-15T14:28:09,453 INFO [main] de.my.super.class: Started server
As you can see, there is no timezone information in the line itself. I add the timezone with a filebeat processor and use it in the pipeline.

Here is the filebeat log event from the debug log:

filebeat event
{
  "@timestamp": "2020-01-15T13:28:17.945Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.4.2"
  },
  "input": {
    "type": "log"
  },
  "host": {
    "name": "mysuperserver.domain.local"
  },
  "agent": {
    "ephemeral_id": "93ab7bfd-7433-4873-8890-4b17c3b8221f",
    "hostname": "mysuperserver.domain.local",
    "id": "91e386db-0acf-43c1-a618-1338b922be63",
    "version": "7.4.2",
    "type": "filebeat"
  },
  "message": "I2020-01-15T14:28:09,452 INFO  [main] de.my.super.class: Started server",
  "fields": {
    "log_type": "log",
    "beats_output": "application"
  },
  "ecs": {
    "version": "1.1.0"
  },
  "event": {
    "timezone": "+01:00"
  },
  "log": {
    "offset": 5696,
    "file": {
      "path": "/opt/application/logs/application.log"
    }
  }
}

In this example the resulting @timestamp field is "2020-01-15T15:28:09.452+01:00", but it should be "2020-01-15T14:28:09.452+01:00". The event.created timestamp shows the correct UTC time for the +01:00 zone, so the timestamp from filebeat seems to be sent correctly.
If I test the pipeline with _simulate (roughly as sketched below), I get the exact same problem. What can I do to get the correct timestamp?
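
For reference, this is roughly how I run the simulation (only a sketch: I call the stored pipeline application-superlog by name and pass a stripped-down copy of the filebeat event shown above):

POST _ingest/pipeline/application-superlog/_simulate
{
  "docs": [
    {
      "_source": {
        "@timestamp": "2020-01-15T13:28:17.945Z",
        "message": "I2020-01-15T14:28:09,452 INFO  [main] de.my.super.class: Started server",
        "event": {
          "timezone": "+01:00"
        }
      }
    }
  ]
}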

My Configuration:

Filebeat Config
filebeat.config.inputs:
    enabled: true
    path: /etc/filebeat/conf.d/*.yml

filebeat.config.modules:
    enabled: true
    path: /etc/filebeat/modules.d/*.yml

setup.kibana.host: "http://kibana.local"

processors:
- add_locale: ~

output.logstash:
    hosts: ["logstash.local:5044"]
    ssl.certificate_authorities: ["/etc/ssl/logstash.crt"]
    loadbalance: true

name: mysuperserver.domain.local
Filebeat Config II (input config in /etc/filebeat/conf.d)
- type: log
  paths:
    - /opt/application/logs/application.log
  fields:
    log_type: log
    beats_output: application
  multiline:
    pattern: '^I[[:digit:]][[:digit:]][[:digit:]][[:digit:]]-[[:digit:]][[:digit:]]-[[:digit:]][[:digit:]]'
    negate: true
    match: after
Logstash Config
output {
    elasticsearch {
      hosts => ["http://elastic.local:9200"]
      manage_template => false
      index => "filebeat-7.4.2-super-index"
      pipeline => "application-superlog"
      ilm_enabled => false
    }
}
My Pipeline
{
  "description": "Pipeline for parsing application log.",
  "processors": [
      {
          "grok": {
              "field": "message",
              "patterns": [
                  "(?m)I%{TIMESTAMP_ISO8601:application.timestamp} +%{LOGLEVEL:log.level} +\\[%{DATA:process.thread.name}\\] +%{DATA:log.logger}: %{GREEDYDATA:application.message}"
              ],
              "ignore_missing": true
          }
      },
      {
          "remove": {
              "field": "message"
          }
      },
      {
          "rename": {
              "field": "@timestamp",
              "target_field": "event.created"
          }
      },
      {
          "rename": {
              "field": "application.message",
              "target_field": "message"
          }
      },
      {
          "date": {
              "field": "application.timestamp",
              "target_field": "@timestamp",
              "formats": [
                  "ISO8601"
              ],
              "timezone": "{{event.timezone}}",
              "ignore_failure": true
          }
      },
      {
          "remove": {
              "field": "application.timestamp",
              "ignore_failure": true
          }
      }
  ],
  "on_failure": [
      {
          "set": {
              "field": "error.message",
              "value": "{{ _ingest.on_failure_message }}"
          }
      }
   ]
 }

The Elastic output:

Elastic Document
{
  "_index" : "filebeat-7.4.2-super-index-000001",
  "_type" : "_doc",
  "_id" : "q4hjqW8B-cbAsJcK-aJp",
  "_score" : 1.0,
  "_source" : {
    "agent" : {
      "hostname" : "mysuperserver.domain.local",
      "id" : "91e386db-0acf-43c1-a618-1338b922be63",
      "type" : "filebeat",
      "ephemeral_id" : "93ab7bfd-7433-4873-8890-4b17c3b8221f",
      "version" : "7.4.2"
    },
    "process" : {
      "thread" : {
        "name" : "main"
      }
    },
    "log" : {
      "file" : {
        "path" : "/opt/application/logs/application.log"
      },
      "offset" : 4334,
      "level" : "INFO",
      "logger" : "de.my.super.class"
    },
    "microapps" : { },
    "message" : "Started server",
    "tags" : [
      "beats_input_codec_plain_applied"
    ],
    "input" : {
      "type" : "log"
    },
    "environment" : "cloud",
    "@timestamp" : "2020-01-15T15:28:09.452+01:00",
    "stage" : "dev",
    "ecs" : {
      "version" : "1.1.0"
    },
    "@version" : "1",
    "host" : {
      "name" : "mysuperserver.domain.local"
    },
    "event" : {
      "timezone" : "+01:00",
      "created" : "2020-01-15T13:28:17.945Z"
    },
    "fields" : {
      "log_type" : "log",
      "beats_output" : "application"
    }
  }
}

//Edit:
I use 7.4.2 for filebeat, logstash and elasticsearch.

The @timestamp is in UTC time, what's your kibana config?

Is UTC time different from your local time?

see Date field saved as default timezone

The logfiles are CET (+0100). I know that elastic stores its events in UTC; that's why I set a timezone in my pipeline config, so that it can parse the date correctly. 14:28:09,452 local time at +01:00 is 13:28:09.452 UTC, so the stored timestamp should be "2020-01-15T13:28:09.452Z" (i.e. "2020-01-15T14:28:09.452+01:00" in local time). That's what I expect.

It seems the ISO8601 format is not parsed correctly. When I use the mapping date format "strict_date_hour_minute_second_millis" I get the error

"Text '2020-01-15T14:28:09,453' could not be parsed at index 19"

That is because of the comma instead of a point before the milliseconds. If the timestamp in the logfile is changed to 2020-01-15T14:28:09.453 it is parsed correctly with that mapping date format.

I tried to set the date format to yyyy-MM-dd'T'HH:mm:ss.SSS but I can't figure out how to escape the single quotes correctly in the pipeline.
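
For reference, this is the date processor variant I was trying to write (just a sketch; if the single quotes can be written directly inside the JSON string, it would look like this, with a comma before the millis to match the log line):

{
  "date": {
    "field": "application.timestamp",
    "target_field": "@timestamp",
    "formats": [
      "yyyy-MM-dd'T'HH:mm:ss,SSS"
    ],
    "timezone": "{{event.timezone}}",
    "ignore_failure": true
  }
}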

You can reproduce this by simulating:

{
  "pipeline": {
    "processors": [
      {
        "date": {
          "field": "timestamp",
          "timezone": "+0100",
          "formats": [ "ISO8601"]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "timestamp": "2020-01-15T14:28:09.452"
      }
    }
  ]
}
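
For comparison, the same simulation with an explicit Java time format instead of ISO8601 (again only a sketch, to check whether the shift comes from the ISO8601 parser):

{
  "pipeline": {
    "processors": [
      {
        "date": {
          "field": "timestamp",
          "timezone": "+0100",
          "formats": [ "yyyy-MM-dd'T'HH:mm:ss.SSS" ]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "timestamp": "2020-01-15T14:28:09.452"
      }
    }
  ]
}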

//Edit:
I opened an issue on github and it seems to be a bug:
