Date Histogram: Can I force the last bucket to be a full interval?


#1

Hi!

Is it possible to have the intervals count backwards from the end of the range? I would like it so that the last bucket returned from a date_histogram aggregation was a complete interval, ending on the last day of my date range.

For example, this

{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "date": {
              "format": "yyyy-MM-dd-HH-mm-ss",
              "gte": "2017-09-04-00-00-00",
              "lte": "2017-10-04-23-59-59"
            }
          }
        }
      ]
    }
  },
  "aggregations": {
    "time_agg": {
      "date_histogram": {
        "field": "date",
        "interval": "week",
        "min_doc_count": 0,
        "extended_bounds": {
          "min": "2017-09-04",
          "max": "2017-10-04"
        }
      }
    }
  }
}

Returns these buckets:

  "buckets": [
{
  "key_as_string": "2017-09-04T00:00:00.000Z",
  "key": 1504483200000,
  "doc_count": 11453
},
{
  "key_as_string": "2017-09-11T00:00:00.000Z",
  "key": 1505088000000,
  "doc_count": 14083
},
{
  "key_as_string": "2017-09-18T00:00:00.000Z",
  "key": 1505692800000,
  "doc_count": 14526
},
{
  "key_as_string": "2017-09-25T00:00:00.000Z",
  "key": 1506297600000,
  "doc_count": 14904
},
{
  "key_as_string": "2017-10-02T00:00:00.000Z",
  "key": 1506902400000,
  "doc_count": 1447
}

]
The last bucket returned is not a full week interval. It starts on Oct. 2 and goes until Oct 4.

I would like the buckets returned to be:

  "buckets": [
{
  "key_as_string": "2017-09-04T00:00:00.000Z",
  "key": ???,
  "doc_count": ???
},
{
  "key_as_string": "2017-09-06T00:00:00.000Z",
  "key": 1505088000000,
  "doc_count": ???
},
{
  "key_as_string": "2017-09-13T00:00:00.000Z",
  "key": ???,
  "doc_count": ???
},
{
  "key_as_string": "2017-09-20T00:00:00.000Z",
  "key": ???,
  "doc_count": ???
},
{
  "key_as_string": "2017-10-27T00:00:00.000Z",
  "key": ???,
  "doc_count": ???
}

]
The last bucket in the desired results is a complete week interval, ending on the last day specified in the date range. If any bucket is not a complete interval, it should be the first bucket, which contains the oldest data.

Is it possible to ensure that the lat bucket returned will be a complete interval?


(Mark Walkom) #2

If you are building these queries yourself then you can use relative date ranges, eg now-7d.
See https://www.elastic.co/guide/en/elasticsearch/reference/5.6/common-options.html#date-math


#3

@warkolm thanks for the response!

I've tried using relative date ranges to modify the extended_bounds of the aggregation so that the total length of the range would divide evenly by the interval. However, there are still cases where the last bucket returned is still in incomplete interval.

Is modifying the extended_bounds parameter using relative date ranges what you had in mind? Or is there another way to use relative date ranges to ensure the last bucket is a complete interval?

If you were thinking of a different way, could you please provide a quick example?

Thanks again for the help!


(Mark Walkom) #4

extended_bounds would be the best way if you're still getting incomplete buckets.


(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.