Rollup strategy in Elastic

I am looking for a feasible way to roll up data I have stored in Elasticsearch. The records are time-series based and can be grouped by timestamp, host, and URL path. What I had in mind was a cron job that looks at all records one day old that have not yet been merged into a granularity. It would bulk-write the new merged records into the same index and, once that completes, run a delete-by-query to remove the raw documents (those with no granularity) within that date range. Eventually I would configure the cron job to run at monthly/yearly granularities as well, and to collapse data to single-document granularity once it reaches a certain age. A rough sketch of the job is below.
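To make the idea concrete, here is a rough sketch of the job using the Python `elasticsearch` client and a composite aggregation (so this assumes Elasticsearch 6.1+). The index name (`metrics`), the field names (`@timestamp`, `host`, `path`, `value`, `granularity`), and the sum metric are placeholders for my actual mapping:

```python
from datetime import datetime, timedelta, timezone

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch("http://localhost:9200")
INDEX = "metrics"  # placeholder index name

# Target window: the most recently completed day.
day_start = (datetime.now(timezone.utc) - timedelta(days=1)).replace(
    hour=0, minute=0, second=0, microsecond=0
)
day_end = day_start + timedelta(days=1)

# Match only raw documents (no granularity assigned yet) inside the window.
raw_docs = {
    "bool": {
        "filter": [
            {"range": {"@timestamp": {
                "gte": day_start.isoformat(), "lt": day_end.isoformat()}}}
        ],
        "must_not": [{"exists": {"field": "granularity"}}],
    }
}

# Page through every (host, path) bucket with a composite aggregation,
# bulk-writing one rolled-up document per bucket back into the same index.
after_key = None
while True:
    composite = {
        "size": 1000,
        "sources": [
            {"host": {"terms": {"field": "host"}}},
            {"path": {"terms": {"field": "path"}}},
        ],
    }
    if after_key:
        composite["after"] = after_key

    resp = es.search(
        index=INDEX,
        size=0,
        query=raw_docs,
        aggs={
            "groups": {
                "composite": composite,
                "aggs": {"value_sum": {"sum": {"field": "value"}}},
            }
        },
    )

    groups = resp["aggregations"]["groups"]
    bulk(
        es,
        (
            {
                "_index": INDEX,
                "_source": {
                    "@timestamp": day_start.isoformat(),
                    "host": b["key"]["host"],
                    "path": b["key"]["path"],
                    "value": b["value_sum"]["value"],
                    "doc_count": b["doc_count"],
                    "granularity": "1d",
                },
            }
            for b in groups["buckets"]
        ),
    )

    after_key = groups.get("after_key")
    if after_key is None:
        break

# Once the rollups are safely written, remove the raw documents for the window.
es.delete_by_query(index=INDEX, query=raw_docs, conflicts="proceed")
```

The composite aggregation is what makes me think a single (paginated) query might be enough: it streams buckets page by page via `after_key` rather than materializing all of them in one response.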

What I am unsure about is the strategy needed to aggregate the data. When there are millions of records to aggregate, is this something I can handle with a single Elasticsearch query that fetches the aggregations, or will I have to use something like Hadoop/MapReduce to read and aggregate the data? Would it be better to store each granularity in a separate index to make the rollup job easier?
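If separate indices turn out to be the better layout, I imagine the cleanup step at least gets much cheaper, since dropping a whole raw index replaces the delete-by-query. A minimal sketch of that variant, with hypothetical index names (`metrics-raw-YYYY.MM.dd` per day, `metrics-rollup-1d` per granularity):

```python
from datetime import datetime, timedelta, timezone

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Hypothetical layout: one raw index per day, one index per rollup granularity.
day = datetime.now(timezone.utc) - timedelta(days=1)
raw_index = day.strftime("metrics-raw-%Y.%m.%d")
rollup_index = "metrics-rollup-1d"

# ... run the same composite aggregation as in the sketch above against
#     raw_index, bulk-writing one document per bucket into rollup_index ...

# Cleanup becomes a cheap index delete instead of a delete-by-query
# over millions of documents.
es.indices.delete(index=raw_index)
```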
