`2020-08-27T09:54:56.185+0100 DEBUG [processor.timestamp] timestamp/timestamp.go:81 Test timestamp [26/Aug/2020:08:02:30 +0100] parsed as [2020-01-26 08:02:30 +0000 UTC]`
As you can see, when the timestamp processor parses the datetime with the defined layout, it does not work as expected, i.e.
26/Aug/2020:08:02:30 +0100 is parsed as 2020-01-26 08:02:30 +0000 UTC
Note that the timestamp processor changes the month from Aug to Jan, which is not expected.
I don't know if this is a known issue, but I can't get it working with the current date format, and using a different date format is out of the question because we receive dates in this format from several sources.
I would appreciate your help in finding a solution to this problem.
Interesting issue, I had to try some things with the Go date parser to understand it.
The Go date parser used by Beats identifies each field in the layout by a fixed reference number. The month is identified by the number 1: in string representation it is Jan, in numeric representation it is 01. Your layout uses 01 for the timezone, which lines up with the 01 in your test date. Interpreted as a month, 01 is January, which explains the date you see. The rest of the timezone (00) is treated as literal text because 00 has no meaning in these layouts. If you look at the parsed date, the timezone is also incorrect. Timezones are identified by the number 7 (as in -0700), or MST in the string representation.
In summary, you need to use -0700 to parse the timezone, so your layout needs to be 02/Jan/2006:15:04:05 -0700.
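To see the difference outside of Beats, here is a minimal Go sketch that parses the test date with both layouts using the standard time package. The original layout is not quoted in this thread, so 02/Jan/2006:15:04:05 +0100 is an assumption based on the description above, and parseUTC is a hypothetical helper, not something from Beats.

```go
package main

import (
	"fmt"
	"time"
)

// parseUTC parses value with the given Go layout and returns the result
// normalized to UTC, formatted like the processor's debug log.
// Hypothetical helper for illustration only.
func parseUTC(layout, value string) (string, error) {
	t, err := time.Parse(layout, value)
	if err != nil {
		return "", err
	}
	return t.UTC().String(), nil
}

func main() {
	const value = "26/Aug/2020:08:02:30 +0100"

	// Assumed original layout: "01" is read as a zero-padded month and
	// overrides the month already parsed from "Jan"; "+" and "00" are literals.
	wrong, err := parseUTC("02/Jan/2006:15:04:05 +0100", value)
	fmt.Println(wrong, err)

	// Suggested layout: "-0700" is the reference value for a numeric offset.
	right, err := parseUTC("02/Jan/2006:15:04:05 -0700", value)
	fmt.Println(right, err)
}
```

With the assumed layout the month comes out as January and no offset is applied. With -0700 the month stays August and the +0100 offset is taken into account.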
Thanks @jsoriano for the explanation. I had actually tried that earlier, but for some reason it didn't work. I tried it again and found it working fine, though it converts the targeted timestamp field to UTC even when the timezone is given as BST.
It's not a showstopper, but it would be good to understand the behaviour of the processor when a timezone is explicitly provided in the config.
`2020-08-27T09:40:09.358+0100 DEBUG [processor.timestamp] timestamp/timestamp.go:81 Test timestamp [26/Aug/2020:08:02:30 +0100] parsed as [2020-08-26 07:02:30 +0000 UTC]`
The timezone provided in the config is only used if the parsed timestamp doesn't contain timezone information. In your case the timestamps contain timezones, so you wouldn't need to provide it in the config.