Is there a way to count the number of documents that fall within an interval, based on two fields?
For example, say you have documents with start and end times. Instead of bucketing on a single field (the start time, in this case), I want buckets that count how many documents are completely or partially contained within each time interval, based on their start and end times.
I understand how to do this for a single interval using range queries. Here start_time is compared against the end time of the interval, and stop_time against the start time of the interval:
"query": {
  "bool": {
    "must": [
      {
        "range": {
          "start_time": {
            "lt": "2018-04-12T00:05:00.000Z"
          }
        }
      },
      {
        "range": {
          "stop_time": {
            "gt": "2018-04-12T00:04:45.000Z"
          }
        }
      }
...
}
This returns the correct values for one sub-interval, but I would ideally like to execute a query and get the correct values for a time range containing multiple intervals. Is this possible?
Interesting, I didn't realize there are range data types. I looked into it, but this would still only allow me to get one interval at a time, right? Histogram and range aggregations don't seem to be compatible with range data types, and the range query only works on one range.
We would like histogram and range aggregations to work on range fields (https://github.com/elastic/elasticsearch/issues/23182), but this will take some time. In the meantime, your best option is probably a filters aggregation with one filter per interval that you want to count. This works whether you model the interval with two fields or with a single range field.
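As a rough illustration of the filters aggregation suggestion, the request body can be generated with one named filter per interval, each reusing the two range clauses from the question. This is only a sketch: the helper names (`iso`, `build_overlap_request`), the aggregation name `overlaps`, and the 15-second step are assumptions made for the example, and the field names are taken from the query in the question.

```python
import json
from datetime import datetime, timedelta, timezone


def iso(ts):
    """Format a UTC datetime like the timestamps used in the question."""
    return ts.strftime("%Y-%m-%dT%H:%M:%S.") + f"{ts.microsecond // 1000:03d}Z"


def build_overlap_request(start, end, step,
                          start_field="start_time", stop_field="stop_time"):
    """Build a search body with one named filter per interval.

    Hypothetical helper, not part of any client library. A document
    matches an interval when it starts before the interval ends and
    stops after the interval starts, so it is counted if it is fully
    or partially contained in that interval.
    """
    filters = {}
    lo = start
    while lo < end:
        hi = min(lo + step, end)
        # The filter is keyed by the interval's start time.
        filters[iso(lo)] = {
            "bool": {
                "must": [
                    {"range": {start_field: {"lt": iso(hi)}}},
                    {"range": {stop_field: {"gt": iso(lo)}}},
                ]
            }
        }
        lo = hi
    return {
        "size": 0,  # only the bucket counts are needed, not the hits
        "aggs": {"overlaps": {"filters": {"filters": filters}}},
    }


body = build_overlap_request(
    datetime(2018, 4, 12, 0, 4, 45, tzinfo=timezone.utc),
    datetime(2018, 4, 12, 0, 5, 30, tzinfo=timezone.utc),
    timedelta(seconds=15),
)
print(json.dumps(body, indent=2))
```

Sending this body to the index's _search endpoint should return one bucket per interval under aggregations.overlaps.buckets, keyed by the interval's start time, each with its own doc_count. A document that spans several intervals is counted once in every interval it overlaps, which matches the "completely or partially contained" requirement.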