Hello,
I have some questions about the rollup API. Can anybody help me?
I already have about 100 daily indices. Now I want to try the rollup API, but I only want to roll up data from now on, not the existing indices. Can this be achieved?
As the data grows, the rollup index's shard size keeps increasing. But the best practice is to keep shard sizes between 20GB and 40GB. Is it necessary to delete old data regularly?
Hm, I suppose it could be tricky to keep the rollup job from backfilling. You can specify an index pattern (logstash-* for example), but that will also include all the existing indices that match.
Right now, the best way is probably with an alias that matches only the indices you care about. The difficulty is that you'll need to continually update the alias as new indices are added and old indices are removed.
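As a rough sketch of the alias approach (the alias name `rollup-source`, the job name `daily_rollup`, the index names, and the `@timestamp` field are all assumptions, not from your setup):

```
# Point an alias only at the new indices you want rolled up
POST _aliases
{
  "actions": [
    { "add": { "index": "logstash-2019.01.15", "alias": "rollup-source" } }
  ]
}

# Create the rollup job against the alias instead of a wildcard pattern
PUT _rollup/job/daily_rollup
{
  "index_pattern": "rollup-source",
  "rollup_index": "rollup_index",
  "cron": "0 0 * * * ?",
  "page_size": 1000,
  "groups": {
    "date_histogram": {
      "field": "@timestamp",
      "interval": "1h"
    }
  }
}
```

You would then run another `POST _aliases` with paired `add`/`remove` actions each day as new indices appear, which is the maintenance burden mentioned above.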
We would like to add a bit more filtering capability to rollup jobs, but it's not possible right now.
Agreed, this is a relatively important feature that's missing at the moment (issue here if you want to follow: [Rollup] Managing index lifecycle · Issue #33065 · elastic/elasticsearch · GitHub). We plan to integrate Rollup with ILM soon, which will allow configuring custom rollover when the index hits a certain size, etc. Right now the only option is to let it grow, or start deleting data... neither of which are good solutions.
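If you do go the manual-deletion route, a delete-by-query on the rollup index's bucket timestamp is one way to trim old buckets. A hedged sketch, assuming the job grouped on a `@timestamp` date_histogram (rollup documents store the bucket key under a `<field>.date_histogram.timestamp` field; adjust the field name and retention window for your job):

```
POST rollup_index/_delete_by_query
{
  "query": {
    "range": {
      "@timestamp.date_histogram.timestamp": {
        "lt": "now-90d"
      }
    }
  }
}
```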
For the first question, I found a way to solve it: before creating the rollup job, I closed the old indices. After the job finished indexing, I reopened those old indices, and the rollup job did not process them.
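For anyone following along, that workaround looks roughly like this (index pattern and job name are illustrative; closed indices are invisible to the job when it first resolves its pattern):

```
# Close the historical indices so the new rollup job skips them
POST logstash-2018.*/_close

# Create and start the rollup job (see the job config earlier in the thread)
POST _rollup/job/daily_rollup/_start

# Once the job has caught up, reopen the old indices for normal searches
POST logstash-2018.*/_open
```

Note that closed indices reject both reads and writes while closed, so this is only practical during a window when the old data doesn't need to be searchable.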