I'm trying to average a metric over every 5-minute window in the last hour, counting back from the current time. So there would be 12 data points for the hour, where each data point is the average value of the metric over its 5-minute window. Currently the metric is written to Elasticsearch every 10 seconds. I was able to write the query below:
GET /%3Cindexname-%7Bnow%2Fd%7D%3E/_search
{
  "size": 0,
  "query": {
    "range": {
      "collectionTime": {
        "gte": "now-1h/h",
        "lt": "now/h",
        "boost": 2.0
      }
    }
  },
  "aggs": {
    "time_buckets": {
      "date_histogram": {
        "field": "collectionTime",
        "interval": "300s"
      },
      "aggs": {
        "some_avg": {
          "avg": {
            "field": "field_to_be_averaged"
          }
        }
      }
    }
  }
}
The problems with this query are:
1) It generates 13 buckets instead of 12.
2) The buckets aren't created from "now" back to "now-5m" and so on. For instance, if this query were run at 11:27 am, the desired output would be 12 averaged values covering 10:27 am - 11:27 am, but the output actually obtained covers 9:30 am - 10:30 am. My guess is that the date histogram rounds the bucket boundaries to multiples of the interval rather than anchoring them at the current time.
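To illustrate what I think is happening: `date_histogram` keys each bucket to a multiple of the interval counted from the Unix epoch (UTC), not from the query's "now", so a document stamped 11:27:42 falls into the bucket keyed 11:25:00. A small self-contained demonstration of that rounding (plain Java, no Elasticsearch involved):

```java
import java.time.Instant;

public class BucketRounding {
    // date_histogram bucket keys are aligned to multiples of the interval
    // counted from the Unix epoch (UTC), not to the query's "now".
    static long bucketStart(long epochMillis, long intervalMillis) {
        return epochMillis - Math.floorMod(epochMillis, intervalMillis);
    }

    public static void main(String[] args) {
        long fiveMinutes = 5 * 60 * 1000L;
        long ts = Instant.parse("2019-05-14T11:27:42Z").toEpochMilli();
        // A document stamped 11:27:42 lands in the bucket keyed 11:25:00.
        System.out.println(Instant.ofEpochMilli(bucketStart(ts, fiveMinutes)));
    }
}
```

If that is the cause, the `offset` parameter of `date_histogram` shifts the bucket boundaries, which might be a way to align them to the query time; the extra 30-minute shift I'm seeing may also be a timezone effect, but I'm not certain.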
I plan to implement this query in a Java service that talks to Elasticsearch through its High Level REST Client. Any insights/pointers on that would be really helpful!
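For reference, this is roughly the shape I have in mind on the Java side. It is only a sketch, assuming a 7.x client: builder method names differ between client versions (e.g. `fixedInterval` was `dateHistogramInterval` in 6.x, and the `Avg` class lives in a different package there), and the index name and client setup are assumed to exist elsewhere:

```java
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramInterval;
import org.elasticsearch.search.aggregations.bucket.histogram.Histogram;
import org.elasticsearch.search.aggregations.metrics.Avg;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class MetricAverager {

    // Sketch only: mirrors the JSON query above, one 5-minute date_histogram
    // with an avg sub-aggregation; the client is assumed to be configured
    // and closed by the caller.
    public void printFiveMinuteAverages(RestHighLevelClient client) throws Exception {
        SearchSourceBuilder source = new SearchSourceBuilder()
            .size(0)
            .query(QueryBuilders.rangeQuery("collectionTime")
                .gte("now-1h/h")
                .lt("now/h"))
            .aggregation(AggregationBuilders.dateHistogram("time_buckets")
                .field("collectionTime")
                .fixedInterval(DateHistogramInterval.minutes(5))
                .subAggregation(AggregationBuilders.avg("some_avg")
                    .field("field_to_be_averaged")));

        SearchRequest request = new SearchRequest("<indexname-{now/d}>").source(source);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);

        Histogram histogram = response.getAggregations().get("time_buckets");
        for (Histogram.Bucket bucket : histogram.getBuckets()) {
            Avg avg = bucket.getAggregations().get("some_avg");
            System.out.println(bucket.getKeyAsString() + " -> " + avg.getValue());
        }
    }
}
```

Corrections on the aggregation/response handling in this sketch would be welcome too.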