'offset' parameter on a date histogram aggregation does not respect the date range filter

When I use the 'offset' parameter on a date histogram aggregation, the resulting buckets do not respect the range filter in the query.

Tested on 7.9.2:

vagrant@dev:~$ curl http://localhost:9200
{
  "name" : "dev",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "inV9zqxvTPW0UecZm3_rhw",
  "version" : {
    "number" : "7.9.2",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "d34da0ea4a966c4e49417f2da2f244e3e97b4e6e",
    "build_date" : "2020-09-23T00:45:33.626720Z",
    "build_snapshot" : false,
    "lucene_version" : "8.6.2",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Steps to reproduce:

  1. Create a new index:
curl -H 'Content-Type: application/json' -XPUT http://localhost:9200/offset_issue/ -d'{
  "mappings": {
    "properties": {
      "time": {
        "type": "date",
        "format": "date_hour_minute_second"
      },
      "value": {
        "type": "float"
      }
    }
  }
}'
  2. Save the following sample content to a file called values.json:
{ "index":{} }
{"time": "2020-08-01T05:00:00", "value": 1}
{ "index":{} }
{"time": "2020-08-01T05:05:00", "value": 2}
{ "index":{} }
{"time": "2020-08-01T05:10:00", "value": 3}
{ "index":{} }
{"time": "2020-08-01T05:15:00", "value": 4}
{ "index":{} }
{"time": "2020-08-01T05:20:00", "value": 5}
{ "index":{} }
{"time": "2020-08-01T05:25:00", "value": 6}
{ "index":{} }
{"time": "2020-08-01T05:30:00", "value": 7}
{ "index":{} }
{"time": "2020-08-01T05:35:00", "value": 8}
{ "index":{} }
{"time": "2020-08-01T05:40:00", "value": 9}
{ "index":{} }
{"time": "2020-08-01T05:45:00", "value": 10}
{ "index":{} }
{"time": "2020-08-01T05:50:00", "value": 11}
{ "index":{} }
{"time": "2020-08-01T05:55:00", "value": 12}
{ "index":{} }
{"time": "2020-08-01T06:00:00", "value": 13}
{ "index":{} }
{"time": "2020-08-01T06:05:00", "value": 14}
{ "index":{} }
{"time": "2020-08-01T06:10:00", "value": 15}
{ "index":{} }
{"time": "2020-08-01T06:15:00", "value": 16}
{ "index":{} }
{"time": "2020-08-01T06:20:00", "value": 17}
{ "index":{} }
{"time": "2020-08-01T06:25:00", "value": 18}
{ "index":{} }
{"time": "2020-08-01T06:30:00", "value": 19}
{ "index":{} }
{"time": "2020-08-01T06:35:00", "value": 20}
{ "index":{} }
{"time": "2020-08-01T06:40:00", "value": 21}
{ "index":{} }
{"time": "2020-08-01T06:45:00", "value": 22}
{ "index":{} }
{"time": "2020-08-01T06:50:00", "value": 23}
{ "index":{} }
{"time": "2020-08-01T06:55:00", "value": 24}
{ "index":{} }
{"time": "2020-08-01T07:00:00", "value": 25}
  3. Load the data:
curl -H 'Content-Type: application/x-ndjson' -XPOST http://localhost:9200/offset_issue/_bulk --data-binary @values.json
  4. Run a query without an offset for 6:00 to 6:25 in 15-minute buckets (2 buckets expected):
curl -H 'Content-Type: application/json' -XPOST 'http://localhost:9200/offset_issue/_search?pretty' -d '{
  "size": 0,
  "aggs": {
    "byDateInterval": {
      "date_histogram": {
        "field": "time",
        "interval": "15m",
        "format": "yyyy-MM-dd HH:mm:ss"
      },
      "aggregations": {
        "aggregationConsumption": {
          "max": {
            "field": "value"
          }
        }
      }
    }
  },
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "time": {
              "gte": "2020-08-01T06:00:00",
              "lte": "2020-08-01T06:25:00"
            }
          }
        }
      ]
    }
  }
}'

Result (as expected):

{
  "aggregations": {
    "byDateInterval": {
      "buckets": [
        {
          "key_as_string": "2020-08-01 06:00:00",
          "key": 1596261600000,
          "doc_count": 3,
          "aggregationConsumption": {
            "value": 15
          }
        },
        {
          "key_as_string": "2020-08-01 06:15:00",
          "key": 1596262500000,
          "doc_count": 3,
          "aggregationConsumption": {
            "value": 18
          }
        }
      ]
    }
  }
}
  5. Run a query with a 5-minute offset for 6:00 to 6:25 in 15-minute buckets (2 buckets expected, covering 6:05 to 6:15 and 6:20 to 6:25):
curl -H 'Content-Type: application/json' -XPOST 'http://localhost:9200/offset_issue/_search?pretty' -d '{
  "size": 0,
  "aggs": {
    "byDateInterval": {
      "date_histogram": {
        "field": "time",
        "interval": "15m",
        "format": "yyyy-MM-dd HH:mm:ss",
        "offset": "+5m"
      },
      "aggregations": {
        "aggregationConsumption": {
          "max": {
            "field": "value"
          }
        }
      }
    }
  },
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "time": {
              "gte": "2020-08-01T06:00:00",
              "lte": "2020-08-01T06:25:00"
            }
          }
        }
      ]
    }
  }
}'

Result (3 buckets instead of the expected 2):

{
  "aggregations": {
    "byDateInterval": {
      "buckets": [
        {
          "key_as_string": "2020-08-01 05:50:00",
          "key": 1596261000000,
          "doc_count": 1,
          "aggregationConsumption": {
            "value": 13
          }
        },
        {
          "key_as_string": "2020-08-01 06:05:00",
          "key": 1596261900000,
          "doc_count": 3,
          "aggregationConsumption": {
            "value": 16
          }
        },
        {
          "key_as_string": "2020-08-01 06:20:00",
          "key": 1596262800000,
          "doc_count": 2,
          "aggregationConsumption": {
            "value": 18
          }
        }
      ]
    }
  }
}

Question: Why does the start time of the first bucket go back by 10 minutes (with a 15-minute interval and a 5-minute offset)? How can I get buckets that start at the beginning of the given date range plus the offset (e.g. from 6:05)?
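
For what it's worth, the keys in the result above look consistent with buckets sitting on a fixed grid counted from the epoch and shifted by the offset, rather than starting at the lower bound of the range filter. Below is a minimal sketch of that arithmetic. It assumes the key is derived this way, and the `bucket_key` helper is hypothetical, not part of Elasticsearch:

```shell
# Hypothetical helper, not an Elasticsearch API. Assumes a bucket key is
#   floor((timestamp - offset) / interval) * interval + offset
# with all values in epoch milliseconds. Shell integer division truncates,
# which equals floor for the positive values used here.
bucket_key() {
  # $1 = timestamp_ms, $2 = interval_ms, $3 = offset_ms
  echo $(( ($1 - $3) / $2 * $2 + $3 ))
}

INTERVAL_MS=900000   # 15m
OFFSET_MS=300000     # +5m

# 06:00:00, 06:05:00 and 06:25:00 on 2020-08-01 as epoch milliseconds
for ts in 1596261600000 1596261900000 1596263100000; do
  echo "$ts -> $(bucket_key "$ts" "$INTERVAL_MS" "$OFFSET_MS")"
done
# 1596261600000 -> 1596261000000   (06:00 lands in the 05:50 bucket)
# 1596261900000 -> 1596261900000   (06:05 starts its own bucket)
# 1596263100000 -> 1596262800000   (06:25 lands in the 06:20 bucket)
```

If that assumption holds, the 06:00 document is the only one matched by the filter that falls before 06:05, which would explain the extra 05:50 bucket with a doc_count of 1. Raising `gte` in the range filter to 2020-08-01T06:05:00 should then leave only the 06:05 and 06:20 buckets, although this is untested against 7.9.2.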
