Elasticsearch mapping with timezone

I have some JSON data that I am parsing with Filebeat and feeding into Elasticsearch. It contains timestamps like this:
"joinTime": "2021-04-23 10:48:10+08:00"
and I used the following mapping to store it:
"joinTime": { "type": "date", "format": "yyyy-MM-dd HH:mm:ssZZZZZ" }
I took the pattern from this post, but the timestamps that ended up in Elasticsearch, as I checked in Kibana, are 8 hours later than they should be. Is there any way I can restore the proper time?

What exactly does "8 hours later than it should be" mean? Have you checked the value with a range query? If so, could you please post it here together with what you expected? Or have you looked at the stored long value, e.g. via doc_values on the date field? In that case please also post that value together with what you expected. With time zones it's easy to confuse things, so I'd like to get the basics clear before jumping to conclusions.
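For anyone wanting to run that check, here is a rough sketch of a search body that combines a range query with `docvalue_fields`, so the stored value comes back as epoch milliseconds next to the original `_source` string. The index name `my-index` and the one-hour window are placeholders I made up, not something from this thread; the body is only built and printed, not sent anywhere.

```python
import json

# Hypothetical index name; replace with your own.
INDEX = "my-index"

search_body = {
    "size": 5,
    # Original string exactly as it was sent to Elasticsearch.
    "_source": ["joinTime"],
    # Stored value, rendered as milliseconds since epoch (UTC).
    "docvalue_fields": [{"field": "joinTime", "format": "epoch_millis"}],
    "query": {
        "range": {
            "joinTime": {
                # Bounds written in the same format as the mapping,
                # with an explicit offset (placeholder window).
                "gte": "2021-04-23 10:00:00+08:00",
                "lt": "2021-04-23 11:00:00+08:00",
            }
        }
    },
    "sort": [{"joinTime": "asc"}],
}

# Paste the output into Kibana Dev Tools or send it with any HTTP client.
print(f"GET /{INDEX}/_search")
print(json.dumps(search_body, indent=2))
```

Comparing the `fields` and `_source` parts of each hit shows whether the stored point in time matches what you expect.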

Dear Christoph, sorry for the late reply! I've been away from work for a while.
As an example, I looked at the earliest entry of all the records:
"sample_time": "2021-04-22 19:42:38"
But when I check it in Kibana, the sample time is 2021-04-23 03:42:38.

As for looking at the stored long value via doc_values, I am not sure what you mean. Could you please elaborate?
Thank you very much in advance.

What I mean is that Elasticsearch internally stores dates as numeric long values representing the time as milliseconds since epoch in the UTC time zone. You can use the "doc_values" option to check the stored value, but it will be formatted according to the field's date format. In your case it will probably be "2021-04-23 02:48:10Z", because that's the date printed according to your format in UTC. The "_source" field of each document contains the original date string.
I don't know how you look at these values in Kibana (that was my original question), but from the Elasticsearch side this seems to be working fine.
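To make the conversion concrete, here is a minimal Python sketch (plain standard library, not Elasticsearch code) that takes the `joinTime` string from the question and computes the epoch-millisecond value and the UTC rendering described above.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

raw = "2021-04-23 10:48:10+08:00"

# %z accepts an offset written as +08:00 (Python 3.7+).
parsed = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S%z")

# Milliseconds since epoch, which is what a date field boils down to.
epoch_millis = (parsed - EPOCH) // timedelta(milliseconds=1)

# The same instant rendered in UTC.
as_utc = parsed.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%SZ")

print(epoch_millis)  # 1619146090000
print(as_utc)        # 2021-04-23 02:48:10Z
```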

What I mean is that all the timestamps shown in Kibana are 8 hours later than they should be. Say the data is supposed to range from 7pm on 22/4 to 11am on 23/4; when I plotted count against time in Kibana, the range shown was 3am on 23/4 to 7pm on 23/4.

This is a big problem for me, because when you set a time range in Kibana you can only view data up to now (not in the future), and since I am monitoring log data with this setup, the latest logs I can see are from 8 hours ago.

I just did what you suggested and checked the doc values. Oddly enough, the times stored in the field are correct.

So I guess what happened is that 8 hours were added when Kibana displayed the data, because of the time zone in the mapping. (I suspect this is the source of the problem because I usually use timestamps without any time zone, and I have never seen something like this before.) How do I get Kibana to display the correct time in this case?

I found a workaround here.
Kibana timestamp in browser local time, but incoming logs UTC - Elastic Stack / Kibana - Discuss the Elastic Stack

Thanks a lot for your time and attention. :)

I was curious and looked at the issue to understand what the problem and the solution were, so I can better help the next user who runs into this. Is it more or less right to say that the dates were stored correctly (and uniformly in UTC) in Elasticsearch, but you needed to change a display-related setting in Kibana so that the UI shows the data in +08:00? That would probably mean that all other data that was entered, e.g. with a UTC time zone, is then displayed with that shift as well?

Dear Christoph,
I thought I had gotten over the issue, but setting the time zone turned out to have other drawbacks as well...
The source of the problem is that each of my data entries has several timestamps corresponding to different events. Some of them come with the time zone "+08:00" at the end, and some don't.
What happened then is that, when Kibana used my browser's default time zone setting, the timestamps with a time zone were treated as UTC+8, while the ones without a time zone were treated as UTC+0. That effectively split the data into two time zones, which is why I couldn't get things straight no matter how I toggled the time zone setting.
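A small Python sketch of what I saw, using the `sample_time` value from my earlier post. It assumes the timestamp without a time zone is stored as if it were UTC, and that the browser (and therefore Kibana's display) is on UTC+8; `BROWSER_TZ` is just a name I picked for the illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the browser time zone used for display is UTC+8.
BROWSER_TZ = timezone(timedelta(hours=8))

# Timestamp that arrived without any time zone information.
naive = datetime.strptime("2021-04-22 19:42:38", "%Y-%m-%d %H:%M:%S")

# With no offset given, the value is taken to be UTC when stored.
stored = naive.replace(tzinfo=timezone.utc)

# The stored instant is then shown in the browser's time zone.
displayed = stored.astimezone(BROWSER_TZ).strftime("%Y-%m-%d %H:%M:%S")

print(displayed)  # 2021-04-23 03:42:38
```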

In short, for anyone else trying to import data with multiple time fields: if they're all in the same time zone, DO NOT ADD ANY TIMEZONE TO THE TIMESTAMPS!!!!!! And if the data comes in with a time zone, REMOVE IT!!! It will come back to bite you.

I disagree, but I see why it takes a bit of effort to handle that kind of data. Since we store everything in UTC internally in Elasticsearch, it is useful to either convert your data to a uniform time zone in the client or provide the time zone information in ALL timestamps, so that we can map them to the correct point in time in UTC. Then the relations between events are preserved correctly. If you are absolutely sure you will never mix time zones, it might be okay to just pretend they are UTC, but keep in mind that 2021-04-23 10:48:10+08:00 != 2021-04-23 10:48:10
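The inequality is easy to see in a few lines of Python (standard library only, just an illustration): the string without an offset, when taken as UTC, ends up 8 hours away from the one that carries +08:00.

```python
from datetime import datetime, timezone

with_offset = datetime.strptime(
    "2021-04-23 10:48:10+08:00", "%Y-%m-%d %H:%M:%S%z"
)
without_offset = datetime.strptime(
    "2021-04-23 10:48:10", "%Y-%m-%d %H:%M:%S"
)

# A timestamp without an offset is interpreted as UTC.
assumed_utc = without_offset.replace(tzinfo=timezone.utc)

# Same wall-clock digits, different points in time.
gap = assumed_utc - with_offset

print(assumed_utc == with_offset)  # False
print(gap)                         # 8:00:00
```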

I am not sure if I can set a time zone for every one of the timestamps, because in my case, besides the timestamps from the original data, there is also the timestamp (time of ingestion) from Filebeat. As far as I remember, these don't come with a time zone by default... Do you think I could set that?

Unfortunately this is outside my area of expertise, but I would ask in the Filebeat part of the forum. If you have multiple data sources that report dates with and without time zone info, some sort of normalization before the data enters ES is probably necessary; otherwise you are mixing data with different reference points in time.
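As a rough idea of what such a normalization step could look like on the client side, here is a small Python sketch. It is not a Filebeat feature; `normalize_timestamp` and `SOURCE_TZ` are hypothetical names, and the assumption that timestamps without an offset are really local UTC+8 times has to be checked against your own data.

```python
from datetime import datetime, timedelta, timezone

# Assumption: timestamps without an offset are local times in UTC+8.
SOURCE_TZ = timezone(timedelta(hours=8))


def normalize_timestamp(raw: str, default_tz: timezone = SOURCE_TZ) -> str:
    """Return the timestamp as a UTC string with an explicit 'Z' suffix."""
    # fromisoformat accepts "YYYY-MM-DD HH:MM:SS" with or without +HH:MM.
    parsed = datetime.fromisoformat(raw)
    if parsed.tzinfo is None:
        # No offset in the data: attach the assumed source time zone.
        parsed = parsed.replace(tzinfo=default_tz)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(normalize_timestamp("2021-04-23 10:48:10+08:00"))  # 2021-04-23T02:48:10Z
print(normalize_timestamp("2021-04-23 10:48:10"))        # 2021-04-23T02:48:10Z
```

After this step both variants map to the same instant. The output format differs from the one in the original mapping, so the field's "format" would need to accept it.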