I'm attempting to do a date_histogram aggregation on a date_range field now that it is supported in 7.4.1. This works well for fully defined date ranges (the first document in my example), but I have a large number of documents that have an undefined end point (value is null, second document in my example). When I run a date_histogram aggregation that includes those documents, I get an out of memory error. I have been looking for a way to limit the buckets created in the query, but without luck. I've also tried to use a date_range aggregation on the fields to attempt to limit the potential buckets, but that caused cast exceptions.
Basically I'm attempting to find all of the months up to now that these documents cover/touch. I'd like to see buckets from 2017-10-01 to 2019-11-01 (as of the time of this topic being written).
Here is a test case with only 2 documents:
PUT test
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 1
},
"mappings": {
"properties": {
"active_range": {
"type": "date_range",
"format": "yyyy-MM-dd"
}
}
}
}
POST test/_doc/
{
"active_range": {
"gte": "2017-10-10",
"lte": "2018-10-10"
}
}
POST test/_doc/
{
"active_range": {
"gte": "2017-10-10",
"lte": null
}
}
Now that the index is setup, run the following aggregation to see the problem:
GET test/_search
{
"size": 0,
"aggs": {
"active": {
"date_histogram": {
"field": "active_range",
"calendar_interval": "month"
}
}
}
}
Any suggestions would be greatly appreciated.