Elasticsearch aggregation time range on streaming data

Hello everyone,

We are developing an external ML analysis tool which pulls existing data from an ES index, creates a model from it, and then keeps querying the engine for new (more recent) data to assess the model and output some results.

Our data is aggregated by timestamp (date histogram), extracting a few metrics (aggregations such as avg, cardinality, etc.). The aggregation interval may be anything (like '5m'). The problem arises with the last bucket, the one that includes 'now': it is not yet complete, so we should wait for it to fill with more documents before taking it into account.

However, when the aggregation interval is "5m", we cannot set an upper bound on the time range like "lt": "now/5m", because date math rounding only accepts a bare unit (such as "now/m"), with no multiplier. So the best we can do is remove the upper bound and drop the last bucket returned. That bucket may include 'now' if the current interval already has documents, but if it has none, the last bucket returned is a complete one and we end up discarding valid data.
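For reference, a minimal sketch of a safer version of that fallback. It assumes each returned bucket carries its start time in epoch milliseconds under `key`, and that fixed "5m" buckets are aligned to multiples of the interval since the epoch (no offset or time zone configured). The function name `drop_incomplete_buckets` is hypothetical:

```python
def drop_incomplete_buckets(buckets, now_ms, interval_ms):
    """Keep only buckets that ended before the bucket containing now_ms.

    Assumption: bucket keys are epoch milliseconds aligned to multiples
    of interval_ms (no offset or time zone on the histogram).
    """
    # Start of the bucket that contains 'now' (the incomplete one).
    current_start = (now_ms // interval_ms) * interval_ms
    # A bucket is complete only if it starts before the current bucket.
    return [b for b in buckets if b["key"] < current_start]
```

With this check the last bucket is dropped only when it really is the one containing 'now', so an empty current interval no longer costs us a valid bucket.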

Is there any way we can get the data "up to the last complete 5m interval"? I am including the query to illustrate the problem.

Thanks in advance.

POST index/_search
{
  "aggregations": {
    "intervs": {
      "aggregations": {
        "count_field_name": {
          "cardinality": {
            "field": "field.name.keyword"
          }
        }
      },
      "date_histogram": {
        "field": "@timestamp",
        "interval": "5m",
        "min_doc_count": 1
      }
    }
  },
  "query": {
    "range": {
      "@timestamp": {
        "gte": "<last_complete_bucket_timestamp>||+5m",
        "lt": "now/5m"
      }
    }
  },
  "size": 0
}
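
One client-side workaround we have considered, sketched here under a few assumptions: the client computes the start of the current (incomplete) bucket itself and sends it as an explicit epoch-millisecond "lt" bound instead of "now/5m". This assumes that fixed-interval buckets are aligned to multiples of the interval since the epoch (no offset, UTC) and that the client and cluster clocks roughly agree. The helper names `current_bucket_start_ms` and `build_range_query` are hypothetical:

```python
def current_bucket_start_ms(now_ms, interval_ms):
    """Floor now_ms to the interval: start of the bucket containing 'now'."""
    return (now_ms // interval_ms) * interval_ms


def build_range_query(gte_ms, now_ms, interval_ms):
    """Build the 'query' part of the search body with an explicit upper bound.

    Assumption: buckets are aligned to multiples of interval_ms since the
    epoch, so the floored value is the exclusive end of the last complete
    bucket.
    """
    return {
        "range": {
            "@timestamp": {
                "gte": gte_ms,
                "lt": current_bucket_start_ms(now_ms, interval_ms),
                # Tell the range query that both bounds are epoch milliseconds.
                "format": "epoch_millis",
            }
        }
    }
```

Calling `build_range_query(last_seen_ms, int(time.time() * 1000), 5 * 60 * 1000)` (with `time` imported) would give the "query" object to place in the request above. We would still prefer a solution expressed purely in the query, if one exists.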
