Bucket keys for date histogram aggregation don't respect timezone


#1

I'm trying to do a date histogram using Spring Data Elasticsearch, allowing a customizable time zone. I'm seeing strange behavior. The doc counts in each bucket change with the time zone, which seems to indicate that the time zone is being used in some way. But the bucket key still seems to stay in GMT. I tried doing this query using the search API just to make sure it wasn't a problem with my Java code:

{
  "aggs": {
    "by_month": {
      "date_histogram": {
        "field":     "dates.created",
        "interval":  "month",
        "time_zone": "America/New_York"
      }
    }
  }
}

And here is a sample of the results:

    {
       "key_as_string": "2015-11-01T00:00:00.000Z",
       "key": 1446336000000,
       "doc_count": 300
    },
    {
       "key_as_string": "2015-12-01T00:00:00.000Z",
       "key": 1448928000000,
       "doc_count": 500
    },

Note that these times (both the key_as_string and key) are at midnight GMT instead of EST. Is there something I'm doing wrong here, or is this expected behavior? My Elasticsearch cluster is running version 1.7.1 if that helps at all.
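As a sanity check on the observation above, the returned keys can be decoded with plain java.time. This is only a sketch: the class and method names (KeyDecoder, render) are made up for illustration and are not part of Elasticsearch or Spring Data Elasticsearch.

```java
import java.time.Instant;
import java.time.ZoneId;

// Hypothetical helper: renders an epoch-millisecond bucket key in a given time zone.
final class KeyDecoder {

    static String render(long epochMillis, String zoneId) {
        return Instant.ofEpochMilli(epochMillis)
                .atZone(ZoneId.of(zoneId))
                .toOffsetDateTime()
                .toString();
    }

    public static void main(String[] args) {
        long novemberKey = 1446336000000L; // "key" of the first bucket in the sample results

        // Midnight in UTC ...
        System.out.println(render(novemberKey, "UTC"));              // 2015-11-01T00:00Z
        // ... which is still the evening of October 31 in New York.
        System.out.println(render(novemberKey, "America/New_York")); // 2015-10-31T20:00-04:00
    }
}
```

So the key really is a UTC month boundary, not a New York one.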


(Adrien Grand) #2

I would advise against using the time_zone parameter in 1.7, since it is quite buggy there. It was fixed in Elasticsearch 2.0. The parameter makes the aggregation compute buckets in the given time zone, e.g. a daily aggregation will round all date values down to midnight in the specified time zone instead of midnight UTC. See https://www.elastic.co/guide/en/elasticsearch/reference/current/breaking_20_aggregation_changes.html#_time_zones_and_offsets and https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html#_time_zone for more information.
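To illustrate the rounding described here, the following is a rough sketch in plain java.time of how a monthly bucket key would come out when buckets are computed in a given time zone. It is not Elasticsearch's actual implementation, and the names (TimeZoneBuckets, monthBucketKey) are hypothetical.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical sketch of time-zone-aware monthly bucketing; not Elasticsearch code.
final class TimeZoneBuckets {

    // Rounds a timestamp down to the start of its month in the given zone
    // and returns that instant as epoch milliseconds.
    static long monthBucketKey(long epochMillis, String zoneId) {
        ZoneId zone = ZoneId.of(zoneId);
        ZonedDateTime local = Instant.ofEpochMilli(epochMillis).atZone(zone);
        ZonedDateTime monthStart = local.toLocalDate().withDayOfMonth(1).atStartOfDay(zone);
        return monthStart.toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        long doc = 1446343200000L; // 2015-11-01T02:00:00Z, i.e. 22:00 on October 31 in New York

        // In UTC the document belongs to the November bucket.
        System.out.println(monthBucketKey(doc, "UTC"));              // 1446336000000
        // In New York it belongs to the October bucket (2015-10-01T04:00:00Z).
        System.out.println(monthBucketKey(doc, "America/New_York")); // 1443672000000
    }
}
```

Under this reading, documents near a month boundary move between buckets when the time zone changes, which would explain the changing doc counts. The November bucket for America/New_York would be keyed at 1446350400000 (2015-11-01T04:00:00Z) rather than 1446336000000.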


#3

Thanks for your reply. Is there any 1.x version where it works properly? Or is there some equivalent in 1.x versions (based on my digging around, perhaps pre_zone and post_zone)?

