How does Elasticsearch get timezone data?

In Brazil we are having a huge problem related to daylight saving time, because the start date was delayed by the President.

Apparently, my cluster thinks that we are in -0200 at America/Sao_Paulo, but actually we are in -0300.

How can I update the timezone data? The Linux server's timezone data is correct.

Hi @arthurbailao

This is certainly an interesting situation.

Elasticsearch uses Joda-Time for its time zone calculations. The latest ES version (6.4.2) uses Joda-Time 2.10, which, according to the information at the release tag in its open source repo, uses version 2018d of the IANA time zone database.

According to news I found about the government's plans to change the DST start date in Brazil, it is not even completely clear yet on which date the actual change happens. Linux distributions might update their time zone database more frequently to reflect changes occurring on such short notice, but for Elasticsearch to include this updated information we will need to incorporate a new version of Joda-Time. Currently 2.10 (which we use) is the latest released version, though.

So, unfortunately, I don't have a good suggestion other than manually patching the joda-time jar file with updated time zone data. I'm not sure that is a good option either, and I don't know a clean way to do it.

Sounds like a good example of politics screwing around with sound engineering.

@arthurbailao
By the way, which version of Elasticsearch are you using? Maybe an update to a more recent version already comes with updated tzdata. If I understand https://time.is/time_zone_news/new_start_date_for_dst_in_brazil correctly, the decision to make 4 November the DST change date was already made last year, so I suspect that this is reflected in current versions of Joda-Time, but maybe your version still uses an older release?

Hi @cbuescher, thank you for your answer!

Yes, unfortunately I'm using an older version of Elasticsearch (2.x).

Do you think that updating the Java timezone data using TZUpdater could solve this issue? Or will Joda-Time just ignore it and use its own version of the timezone database?

Looking at the timezone data versions for the JRE, the problem is solved as of tzdata2018c.
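To see what the JVM itself believes after a TZUpdater run, a small check against java.time can help. This is only a sketch: the class and method names (JvmTzCheck, offsetAt) are made up for illustration, and it only inspects the JVM's own tz database. If Joda-Time bundles its own data, as suspected in this thread, a correct result here would not mean Elasticsearch is fixed.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

class JvmTzCheck {
    // Offset that the JVM's own tz database (java.time) reports for a zone at an instant.
    static ZoneOffset offsetAt(String zoneId, Instant when) {
        return ZoneId.of(zoneId).getRules().getOffset(when);
    }

    public static void main(String[] args) {
        // A date between the old DST start (third Sunday of October)
        // and the new one (4 November 2018).
        Instant when = Instant.parse("2018-10-25T12:00:00Z");
        System.out.println("America/Sao_Paulo at " + when + " -> "
                + offsetAt("America/Sao_Paulo", when));
    }
}
```

If this prints -03:00 the JVM has the corrected rule; -02:00 would suggest it still has the old DST start date.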

My guess is that the Joda library uses its own tzdata resources, but I don't know enough about its internal workings at the moment. Joda 2.10, which we have used since Elasticsearch 6.4, seems to use a newer tzdata version.

Hi everyone. I'm considering manually upgrading the joda-time JAR.
Check Joda's changelog here: https://www.joda.org/joda-time/changes-report.html

For the Brazilian DST this year we need at least 2018a. (A second proposed DST change, for the ENEM national exams, was somehow "undone" or never officially published; had it gone through, we would need at least 2018f.)
This means joda 2.10.x solves it for us.

ES (my version, 6.3.2) has /usr/share/elasticsearch/lib/joda-time-2.9.9.jar whose SHA1 sum matches that of Joda-time's release page (https://github.com/JodaOrg/joda-time/releases) - so ES seems to be using it with no changes. But... I may remember from 1.x days that joda-time needed to be hacked into by ES for performance reasons... is it still true? Are there joda classes in other jars?
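For anyone who wants to repeat the checksum comparison, here is a minimal sketch that prints the SHA-1 of a jar so it can be compared by eye with the published value. The class name JarSha1 and the default path are just my setup, so adjust them to yours.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class JarSha1 {
    // Streams the file through SHA-1 and returns the digest as lowercase hex.
    static String sha1Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Default path is the one from my install; pass another path as the first argument.
        Path jar = Paths.get(args.length > 0
                ? args[0]
                : "/usr/share/elasticsearch/lib/joda-time-2.9.9.jar");
        System.out.println(sha1Hex(jar) + "  " + jar);
    }
}
```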

Also, how does https://github.com/elastic/elasticsearch/issues/27330 affect this? We already updated the JVM with no results. (Maybe this is another reason to migrate to java.time; our JVM update would have solved it.)

Anyway I am gonna go ahead and give it a spin, report in a bit.

Note that 5.6.13 and 6.4.3 will have the updated database.

I may remember from 1.x days that joda-time needed to be hacked into by ES for performance reasons... is it still true?

No, we no longer hack joda.

Maybe this is another reason to migrate to java.time, our JVM update would have solved it

Yes, that will be an additional benefit of using java time.

Great, thanks.
I just updated joda manually on a single-node cluster and it seems to work.
Specifically, an aggregation like

"date_histogram": {
  "field": "timestamp",
  "format": "MM-yyyy",
  "time_zone": "America/Sao_Paulo",
  "interval": "1M",
  "offset":    "0",
  "order": {
    "_key": "asc"
  },
  "keyed": false,
  "min_doc_count": 0
}

would fail on our dataset due to the DST error in the keying between October and November 2018: it would return 31/Oct 23:00 instead of 1/Nov 00:00, effectively "eating up" November's data.
Updating joda fixed it.
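One way to read that symptom, as a sketch only: this is not Elasticsearch's actual rounding code, and the class and method names (BucketKeyShift, monthStart) are made up. If the stale data believes the zone is already at -02:00 at the end of October, the start of the November bucket lands one hour early when viewed at the real -03:00 offset.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

class BucketKeyShift {
    // Start of a monthly bucket as an instant, using the offset the tz data claims applies.
    static Instant monthStart(int year, int month, ZoneOffset assumedOffset) {
        return LocalDateTime.of(year, month, 1, 0, 0).toInstant(assumedOffset);
    }

    public static void main(String[] args) {
        ZoneOffset stale = ZoneOffset.ofHours(-2);  // what the outdated tzdata assumes
        ZoneOffset actual = ZoneOffset.ofHours(-3); // what Sao Paulo was really on

        Instant key = monthStart(2018, 11, stale);  // 2018-11-01T02:00:00Z
        OffsetDateTime seenLocally = key.atOffset(actual);
        System.out.println(seenLocally);            // prints 2018-10-31T23:00-03:00
    }
}
```

With both offsets set to -03:00 the key comes out as 1/Nov 00:00 local time, which matches what we saw after swapping the jar.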

I understand newer minor versions have the fixed jar, thanks for that.
I've been handling this DST fuckup since last week and there's all kinds of tzdata embedding everywhere; at least here we can just swap JARs like in the good old days.
Anyway a 6.3.3 would be awesome.

@rjernst considering the successful upgrade of joda-time JAR by @rpardini, would you recommend performing the same operation for Elasticsearch 2.3.4?

@arthurbailao ES 2.3.4 uses pristine joda-time 2.9.4.
ES 2.4.6 uses joda-time 2.9.5, also pristine.
My testing indicates that both can be updated to use joda-time 2.10.1 with no problems.
Just remove the Joda .jar from the lib directory (on Ubuntu, /usr/share/elasticsearch/lib), add the new one, and restart ES.
Good luck.