Date_histogram - extended bounds, converting null to 0

Hello,

I have a collection of rooms in Elasticsearch 7.11.0, where each room has a data and a capacity for that date:

PUT /capacities_index
{
  "mappings": {
    "properties": {
      "date": {
        "type": "date"
      },
      "capacity": {
        "type": "integer"
      }
    }
  }
}

Data for these three dates:

POST _bulk
{"index":{"_index":"capacities_index"}}
{"date":"2023-01-01", "capacity": 10 }
{"index":{"_index":"capacities_index"}}
{"date":"2023-01-02", "capacity": 5 }
{"index":{"_index":"capacities_index"}}
{"date":"2023-01-03", "capacity": 3 }

I have the following query:

GET /capacities_index/_search?size=0
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "date": {
              "gte": "2023-01-01",
              "lte": "2023-01-04"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "per_day": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "day",
        "min_doc_count": 0,
        "extended_bounds": {
          "min": "2023-01-01",
          "max": "2023-01-04"
        }
      },
      "aggs": {
        "capacity_by_day": {
          "min": {
            "field": "capacity"
          }
        }
      }
    },
    "capacity": {
      "min_bucket": {
        "buckets_path": "per_day>capacity_by_day"
      }
    }
  }
}

I want to find the minimum capacity over a date range, but my range my not correspond to the dates I have in my index. For the query above, there isn't a specific entry for "2023-01-04", so I want the "capacity" value for this date to resolve to 0.

I get this (relevant part only) response:

"aggregations" : {
    "per_day" : {
      "buckets" : [
        {
          "key_as_string" : "2023-01-01T00:00:00.000Z",
          "key" : 1672531200000,
          "doc_count" : 1,
          "capacity_by_day" : {
            "value" : 10.0
          }
        },
        {
          "key_as_string" : "2023-01-02T00:00:00.000Z",
          "key" : 1672617600000,
          "doc_count" : 1,
          "capacity_by_day" : {
            "value" : 5.0
          }
        },
        {
          "key_as_string" : "2023-01-03T00:00:00.000Z",
          "key" : 1672704000000,
          "doc_count" : 1,
          "capacity_by_day" : {
            "value" : 3.0
          }
        },
        {
          "key_as_string" : "2023-01-04T00:00:00.000Z",
          "key" : 1672790400000,
          "doc_count" : 0,
          "capacity_by_day" : {
            "value" : null
          }
        }
      ]
    },
    "capacity" : {
      "value" : 3.0,
      "keys" : [
        "2023-01-03T00:00:00.000Z"
      ]
    }
  }

The capacity_by_day value is null (can I change this to 0?), and the capacity min_bucket aggregation ignores the null value.

Is there a way to convert this to return 0 for the range?

Or perhaps, I'm going about this completely the wrong way, and there's a better alternative to date_histogram?

Thanks for looking at my question :slight_smile:

Set gap_policy in min bucket aggregation.

"capacity": {
      "min_bucket": {
        "buckets_path": "per_day>capacity_by_day",
        "gap_policy": "insert_zeros"
      }
    }
1 Like

Great! Thanks so much @Tomo_M :slight_smile:

1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.