Data aggregation and storage for specific time interval

wq_Huang · December 1, 2015, 8:34am

My deployed ES cluster receives a huge amount of data per second, and every doc it stores has some numeric fields. Aggregations over these fields, such as min, max, count, and average, are computed on the fly. The problem is that with so much raw data in ES, these calculations become unbearably slow. What's worse, I cannot keep the data for a long time, such as 1 year, because of limited storage space, and storing all of the raw data for a whole year is unnecessary anyway. It would be much better to aggregate at fixed intervals (every 10s, 1min, 10min, etc.), store the aggregated data in ES, and discard the raw data. For example, ES would keep raw data for 1 month, 10s aggregated data for 6 months, and 10min aggregated data for a year. That way I do not need to store the raw data for a year, while the trends in the data are preserved at reduced precision.
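To make the idea concrete, here is a rough sketch in plain Python of the kind of rollup I have in mind. It does not talk to ES at all; the `rollup` function, the `ts` field (epoch seconds), and the `value` field are hypothetical names used only for illustration.

```python
from collections import defaultdict


def rollup(docs, interval_s, field="value"):
    """Group docs into fixed time buckets and summarize one numeric field.

    docs: iterable of dicts with an epoch-seconds "ts" key (assumed name)
    and a numeric field (assumed to be called "value" by default).
    """
    buckets = defaultdict(list)
    for doc in docs:
        # Align each timestamp to the start of its bucket.
        bucket_start = doc["ts"] - doc["ts"] % interval_s
        buckets[bucket_start].append(doc[field])

    # One summary doc per bucket, ordered by bucket start time.
    return [
        {
            "ts": start,
            "interval_s": interval_s,
            "min": min(values),
            "max": max(values),
            "count": len(values),
            "avg": sum(values) / len(values),
        }
        for start, values in sorted(buckets.items())
    ]


# Hypothetical raw docs: three fall in the [0, 10) bucket, one in [10, 20).
raw = [
    {"ts": 0, "value": 2.0},
    {"ts": 4, "value": 6.0},
    {"ts": 9, "value": 4.0},
    {"ts": 12, "value": 10.0},
]
summaries = rollup(raw, interval_s=10)
```

Each summary doc would be stored in place of the raw docs it covers. Keeping `count` next to `avg` matters if the 10s summaries are later rolled up again into 10min ones, since the coarser average has to be weighted by the count of each finer bucket.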

Is there any way to accomplish that? I know statsd+graphite can do the job, but I don't want to deploy and maintain one more system. I am currently using Logstash+ES+Kibana for data filtering, storage, and presentation respectively.

Thanks.
