Timeseries with startDate-endDate. Aggregations on different date intervals

Hi team,

Problem statement
I want to store events over time as new records in my storage system. Imagine I want to monitor taxi location changes. I could keep the data in the format [taxi_id, time_start, time_end, location_id].

A location can be a geofence or any other kind of region.

Type of Queries
Then I want to query various things within a time interval, e.g. how much time a taxi driver spent in location_1 during the interval [11:00 - 12:00] today.

My data is clearly a fit for time-series storage, since I add a new record every time an event is triggered.

The problem is that a single record may span a longer interval than the one supplied by my user (e.g. from a UI application).

e.g. record [Robert Denniro, 12/13 10:00 - 12/13 11:30, location_1]

If the query is how much time Robert spent in the interval [12/13 10:00 - 12/13 11:00], I can't aggregate with something fixed like SUM(EndTime - StartTime). I need to compute the overlap between each record in my DB and the user's interval(s). For the record above the answer should be 1 hour, not the 1.5 hours the full record covers.
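
To make the overlap check concrete, here is a minimal Python sketch of what I mean. It is not tied to any storage engine; the function names, the tuple layout and the year 2023 in the sample data are assumptions I made up for illustration.

```python
from datetime import datetime, timedelta

def overlap(rec_start, rec_end, q_start, q_end):
    """Time that the record interval shares with the query interval."""
    start = max(rec_start, q_start)
    end = min(rec_end, q_end)
    # If the intervals do not intersect, end <= start and the overlap is zero.
    return max(end - start, timedelta(0))

def time_in_location(records, taxi_id, location_id, q_start, q_end):
    """Sum the clipped durations of all matching records."""
    total = timedelta(0)
    for r_taxi, r_start, r_end, r_loc in records:
        if r_taxi == taxi_id and r_loc == location_id:
            total += overlap(r_start, r_end, q_start, q_end)
    return total

# Sample record from the post (the year is an assumption).
records = [
    ("Robert Denniro",
     datetime(2023, 12, 13, 10, 0),
     datetime(2023, 12, 13, 11, 30),
     "location_1"),
]

spent = time_in_location(
    records, "Robert Denniro", "location_1",
    datetime(2023, 12, 13, 10, 0), datetime(2023, 12, 13, 11, 0),
)
print(spent)
```

The key point is that each record is clipped to the query interval before it is summed, which is exactly what a plain SUM over stored durations cannot do.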

2 questions:

  1. Is Elasticsearch a good fit for this kind of query (data with intervals, i.e. a startTime and an endTime)?
  2. Does Elasticsearch provide out-of-the-box support for this kind of aggregation?

Clearly there are 2 approaches for calculating such queries:

  1. You do the calculation against the raw data records. In this case a script/method has to be executed for each row to find the intersection between the record's interval and the query interval.
  2. You cook the data in advance, in such a way that when the user provides an interval, every matching record falls entirely inside it. This way I could use a
    SELECT SUM(EndTime - StartTime) aggregation
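
As a rough sketch of the second approach, the snippet below cuts each record at fixed bucket boundaries before it is stored. The one-hour bucket size, the function names and the year in the sample data are assumptions for illustration only. It works with a plain sum only when the user's interval is aligned to the same bucket boundaries.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def split_into_buckets(start, end, bucket=timedelta(hours=1)):
    """Yield (piece_start, piece_end) pieces of [start, end), cut at bucket boundaries."""
    cursor = start
    while cursor < end:
        # First bucket boundary strictly after the cursor.
        next_boundary = EPOCH + ((cursor - EPOCH) // bucket + 1) * bucket
        piece_end = min(next_boundary, end)
        yield cursor, piece_end
        cursor = piece_end

def sum_durations(pieces, q_start, q_end):
    """Plain SUM(end - start) over the pieces that lie fully inside the query interval."""
    return sum(
        (e - s for s, e in pieces if s >= q_start and e <= q_end),
        timedelta(0),
    )

# The sample record 10:00 - 11:30 becomes two pieces: 10:00 - 11:00 and 11:00 - 11:30.
pieces = list(split_into_buckets(
    datetime(2023, 12, 13, 10, 0), datetime(2023, 12, 13, 11, 30)
))
total = sum_durations(
    pieces, datetime(2023, 12, 13, 10, 0), datetime(2023, 12, 13, 11, 0)
)
print(pieces, total)
```

The trade-off is more stored rows and a fixed query granularity in exchange for a simple aggregation at query time.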

I'd appreciate your feedback on this.

