Timezone-affected dates

I have my_datetime and time_zone fields (my_datetime is a datetime UTC
field and time_zone is the string version of the timezone, eg
"Europe/Paris").

I currently use the AT TIME ZONE function in postgres to transform
dates on the fly in my where clause.

Eg WHERE my_datetime AT TIME ZONE 'Europe/Paris' > now

Can someone let me know what the equivalent of this WHERE clause is in
Elasticsearch? I expect it is something to do with the Date Histogram
Facet, but I'm not too sure.

Thanks

--

On 1/19/2013 11:28 AM, amnesia7 wrote:

I have my_datetime and time_zone fields (my_datetime is a datetime
UTC field and time_zone is the string version of the timezone, eg
"Europe/Paris").

"datetime" is not a Postgres data type; it is only the general name for a
set of date/time types.

I currently use the AT TIME ZONE function in postgres to transform
dates on the fly in my where clause.

Since AT TIME ZONE takes a timestamp (or a time), I'll assume your
my_datetime column is what Postgres calls a timestamp.
In Postgres a timestamp can be either with or without a TZ; luckily, "For
timestamp with time zone, the internally stored value is always in UTC"
[ibid]. So "with or without" is really just a note on how to convert
from ISO-8601 fields to the internal microsecond count value. This is
great, because this is similar to how ES does it.

Hopefully, if you were using timestamp with time zone, then either (1) you
always specified a TZ, (2) the server was running in the UTC/GMT TZ
(used as the default), or (3) you were using Java JDBC, which always sends
a java.util.Date; I'll assume that means the internal long value of the
Java Date is understood as a millisecond offset from 1970-01-01
00:00:00.000 UTC and used directly.
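As a minimal sketch of that assumption (the class and method names below are mine, not from the thread): a java.util.Date is nothing but a millisecond offset, and offset zero formats as the epoch instant once the formatter is pinned to UTC.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class EpochDemo {
    // Render a java.util.Date in UTC; without setTimeZone the JVM's
    // default zone would be used instead.
    static String formatUtc(Date d) {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f.format(d);
    }

    public static void main(String[] args) {
        // The internal long is an offset from 1970-01-01 00:00:00.000 UTC,
        // so a Date built from 0L is exactly that instant.
        System.out.println(formatUtc(new Date(0L))); // 1970-01-01 00:00:00.000
    }
}
```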

I see the questions as:

  1. How to store a value as some kind of timestamp?
  2. How to compare now to a timestamp?

Question 1: Storing

There is a Date type in Elasticsearch which is defined as:
"a special type which maps to JSON string type. It follows a specific
format that can be explicitly set. All dates are UTC. Internally, a date
maps to a number type long, with the added parsing stage from string to
long and from long to string.
[...]
The date type will also accept a long number representing UTC
milliseconds since the epoch, regardless of the format it can handle."

Luckily for you, that is very much like the Postgres value and maps
perfectly to a java.util.Date (because that is how it is implemented).
Both have that 4713 BC value as the smallest negative value
mentioned in the Postgres docs (but don't ask me how Postgres really
maps the value stored in 8 bytes down to microseconds,
since java.util.Date is also 8 bytes and is only good to milliseconds).

  1. Define a mapping that says the field is mapped as a "date" type. You
    can specify a format for input and output conversion (but see the note
    below when debugging and developing the code).

  2. Insert, specifying either:
    2.1 a long integer
    taken directly from reading the long integer returned from Postgres.
    For example, in Java:
    long my_datetime = my_datetime_asDate_FromJDBC.getTime();
    then using that to create the insert JSON:

     { ... "my_datetime": 12345689... ... }

    2.2 a string FORMATTED ASSUMING THE UTC TIMEZONE (Java
    does not assume UTC when creating formatted string dates; be very
    careful about this) and using that in your insert JSON:

{ ... "my_datetime": "2012-01-19 18:38:16.123" ... }

Recommendation: print out the string-formatted date in your code using a
formatter that prints the TZ, just to check that the date and time make
sense as UTC. In Java this means setting the TZ to UTC on the
SimpleDateFormat and showing the value of the fields including the TZ. In
other languages, I can't help you.
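In Java, that recommended check might look like the following sketch (the millisecond value is an arbitrary example, not from the thread):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class TzCheck {
    // Format with an explicit zone and a pattern ending in "z" so the
    // zone is visible in the output and mistakes are easy to spot.
    static String format(Date d, String tzId) {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS z");
        f.setTimeZone(TimeZone.getTimeZone(tzId));
        return f.format(d);
    }

    public static void main(String[] args) {
        Date d = new Date(1327419900000L); // arbitrary example instant
        // Without setTimeZone, SimpleDateFormat silently uses the JVM's
        // default zone, so the string you index may not be UTC at all.
        System.out.println(format(d, "UTC"));
        System.out.println(format(d, "Europe/Paris")); // one hour later in winter
    }
}
```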

Now you've got it in, so how to select using it?

Eg WHERE my_datetime AT TIME ZONE 'Europe/Paris' > now

Can someone let me know what the equivalent of this WHERE clause is in
Elasticsearch? I expect it is something to do with the Date Histogram
Facet, but I'm not too sure.

Date histogram is for gathering statistics about ranges of dates and is
sort of like some of the aggregate functions in SQL; for example, count().

All you need is a search query similar to

$ curl -XPOST 'http://localhost:9200/myIndex/myType/_search' -d '
{
  "query": {
    "range": {
      "my_datetime": {
        "gt": "2013-01-19 00:00:00"
      }
    }
  }
}'
see the range query documentation.

There is a "now" value; see that page:

"now" will include THE CURRENT TIME OF DAY, because Java and ES dates
are a combination of a Date and a Time, so you have to define whether
you mean midnight at the beginning of today or RIGHT now for your
comparison.

If the date does not contribute to the score, I'd probably put the range
in the filter part of my query, but that is not required.

The string returned in the JSON will be interpreted with the TZ of the
server but will not necessarily show the TZ. While testing, you can see
the TZ by setting the format for your date field in your mapping to
"basic_date_time" or something else which includes the TZ.

You will notice that the above does not include a step equivalent to
the Postgres AT TIME ZONE. Why? Because comparing a "date" field to
"now" or a string like "2013-01-18" requires no extra conversion of
the value in the index; a date value ("2013-01-18", but not now) is
converted using the defined in/out format for the field.

Since the above is NOT taken from running code, please try all of it with
a simple scratch ES index and print out dates and times with timezones at
all stages (reading from the DB, before using a long to write, etc.), in
order to see the interpretation of the values sent and returned. Only
then change the format to the most convenient for your end use.

I hope that helps,
-Paul

--

Hi Paul,

Thanks for the reply but I don't think I was quite clear about my situation.

In my postgres database I have a timestamp field/column (stored without
timezone) and I have a timezone field/column (can be any of the
string-style timezones eg Europe/London, Europe/Paris, etc).

I'm using the AT TIME ZONE function in my WHERE clause to convert all
records (using their associated timezone) so that I can just return the
records in the future no matter which timezone they are in.

So far I have the timestamp field in my ES mapping as a "date" type and the
timezone field as a "string" type.

Does that help clarify my situation a bit better?

Thanks

On Sunday, January 20, 2013 3:42:45 AM UTC, P Hill wrote:

[quoted reply snipped; see Paul's message above]

--

Can anyone offer a solution or any advice how to do this?

--

On Tue, 2013-01-22 at 12:19 -0800, amnesia7 wrote:

Can anyone offer a solution or any advice how to do this?

You need to reread Paul's extremely detailed and useful reply to your
original email.

Also, don't store dates in a particular timezone. Store them as UTC.
Timezones are for display purposes only.

clint

--

Question for Clint near the end.

On 1/22/2013 12:25 PM, Clinton Gormley wrote:

On Tue, 2013-01-22 at 12:19 -0800, amnesia7 wrote:

Can anyone offer a solution or any advice how to do this?

You need to reread Paul's extremely detailed and useful reply to your
original email.

Also, don't store dates in a particular timezone. Store them as UTC.
Timezones are for display purposes only

clint

I think that "last mile" (last second?) conversion, where Amnesia wants
to convert the timestamp value in the field to a string with the
specified timezone, is not available as some trick in either:
a script in the search request that combines the two fields without
writing a native script,
or
some specification in the mapping on the format used to generate the
JSON that dynamically uses the TZ field.

So I think Amnesia will have to take the formatted date returned in the
json and the field specifying the TZ and convert to a localized date
string in client-side code.

In Java that would be two steps. Either:

  1. Go from the returned String of the form "2013-01-23 23:55:59.000" and
    parse it to a java.util.Date (a long internally),
    then reformat this java.util.Date into whatever format you are going to
    use in your application, with the timezone that came from the index, and
    create your String.
    or
  2. Use the appropriate Joda-Time static factory function that
    assumes (or can be told) to use UTC to create a
    DateTime. Then create another DateTime asking to move it to a different
    TZ. Format the String as needed.
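Option 1 might look like this sketch (the field values and date pattern are examples of mine; adjust the pattern to whatever your mapping actually emits):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class LastMile {
    // Step 1: parse the string ES returned, telling the parser the
    // value is UTC (otherwise the JVM default zone is assumed).
    static Date parseUtc(String s) {
        SimpleDateFormat in = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
        in.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return in.parse(s);
        } catch (ParseException e) {
            throw new RuntimeException(e);
        }
    }

    // Step 2: reformat in the timezone that came back in the TZ field.
    static String toLocal(Date d, String tzId) {
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
        out.setTimeZone(TimeZone.getTimeZone(tzId));
        return out.format(d);
    }

    public static void main(String[] args) {
        Date d = parseUtc("2013-01-23 23:55:59.000");
        // Paris is UTC+1 in January, so this prints the next day's 00:55.
        System.out.println(toLocal(d, "Europe/Paris"));
    }
}
```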

Since I hate to jump through conversion steps (because I usually fail
the 1st few times since it involves the TZ of two different machines and
using a 3rd explicit value), I might try to generate a "script_field"
(look it up if you are not following along) that contains the long that
is really stored in the index. Too many places to forget what TZ is in use.

Clint, could Amnesia use a script that involves

"field['timestampField'].longValue"

to simply keep it as a long and skip any conversion in ES?

This way Amnesia could convert on the client-side directly from the
stored time stamp long value and the given TZ into the needed local time
String.

I have provided plenty of useful information and really don't have time
to try this, so I hope Amnesia gives it a try and reports back.

-Paul

--

Hi Paul

Clint, could Amnesia use a script that involves

"field['timestampField'].longValue"

Yes

to simply keep it as a long and skip any conversion in ES?

This way Amnesia could convert on the client-side directly from the
stored time stamp long value and the given TZ into the needed local time
String.

Yeah - you have been very helpful, but juggling timezones in the DB
feels like the wrong approach. Just too many places where you can go
wrong.

clint

--

Thanks for the help.

I'm using the RoR Tire gem which allows methods to be used to pass values
into ES.

I was just wondering if I can use my timestamp field + my TZ field to
create an extra field in ES that is a timestamp with TZ and is for the
purpose of this where clause/filter.

So my mapping would include indexes timestamp_field type 'date', indexes
tz_field type 'string', indexes tz_timestamp_field type 'date' or something
like that and then I do my filter based on tz_timestamp_field.

I'll have to give it a try.

--

On 1/24/2013 10:36 AM, amnesia7 wrote:

Thanks for the help.

I'm using the RoR Tire gem which allows methods to be used to pass
values into ES.

I was just wondering if I can use my timestamp field + my TZ field to
create an extra field in ES that is a timestamp with TZ and is for the
purpose of this where clause/filter.

Any selecting or filtering only needs to be able to compare values for
<, > or =.
A Date in ES is not a bunch of fields (2013-01-24 T 23:45:00.003); it is a
long integer, so any range selection DOES NOT NEED TZ conversion.

I'm worried you are not getting the idea of how the value is stored in
ES (or Java). There are no date-time fields stored (hour, minute, month
...), the definition of the format is for converting on the way in and
converting on the way out for the strings in the json. Comparison for
ordering needs no TZ information.
The sun came up somewhere in North America at 2013-01-24 15:43:00.000 Z
(UTC) (3:43 in the afternoon London time, but 7:43 in the morning
somewhere on the west coast).
All values in the index are stored as long integers, thus that time is
stored as 1359042180000L. It is just a number. Therefore, you don't
need to convert to any TZ in order to compare.
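A small sketch of that point (the helper method and its arguments are mine): a UTC wall-clock time reduces to a plain long, and the "range" comparison is just a numeric compare on those longs, with no timezone involved.

```java
import java.util.Calendar;
import java.util.TimeZone;

public class LongCompare {
    // Epoch milliseconds for a wall-clock time read as UTC.
    static long utcMillis(int y, int mo, int d, int h, int mi, int s) {
        Calendar c = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        c.clear(); // zero out milliseconds and any leftover fields
        c.set(y, mo - 1, d, h, mi, s); // Calendar months are 0-based
        return c.getTimeInMillis();
    }

    public static void main(String[] args) {
        long sunrise = utcMillis(2013, 1, 24, 15, 43, 0);
        long midnight = utcMillis(2013, 1, 24, 0, 0, 0);
        System.out.println(sunrise);            // just a number
        // The range selection needs no TZ: it is a plain long compare.
        System.out.println(sunrise > midnight); // true
    }
}
```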

That said, creating the right field for comparison at search time is often
a very good general rule.
But do you want to know if a date is past midnight UTC, past the current
moment in time, or past midnight local time?
None of these requires you to know what TZ has been recorded in the index
along with the time.

-Paul

--

I've added an extra UTC field (dynamically generated and then passed into
the ES index) that I can use for comparison when searching instead, which
works fine, so unless something changes I can put this to rest now.

Thanks.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.