I’m evaluating the frozen tier functionality in an Elastic Cloud deployment, aiming to keep around ten years of historical data available for occasional analysis and compliance purposes.
Given that we plan to retain data for such a long period, what should we take into account regarding the use of the frozen tier and the compatibility between Elasticsearch versions and the underlying snapshots? Are there any known limitations or recommended best practices to ensure long-term accessibility?
Also, is there any way to limit access to indices once they move to the frozen tier? Ideally, only a few users should be able to query those historical indices, but since permissions are applied at the data stream level, it’s not clear to me how to handle that separation.
Any guidance or examples would be greatly appreciated.
Thanks!
This is a big question, with lots of aspects to consider from my perspective.
First, the simple stuff:
1st:
Here is the snapshot compatibility guide... In general you can see the documented backwards compatibility guarantees, but will that hold for 10 years into the future? That is hard to say.
2nd:
This one is easy. The answer is yes: you can create roles that limit access based on a time range or even just the data tier...
For example, a role can include a query like the one below, and it will exclude the frozen tier from any search.
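Here's a minimal sketch using document level security on the `_tier` metadata field (the role name and index pattern are just placeholders, so adjust them to your setup):

```
PUT /_security/role/exclude_frozen_reader
{
  "indices": [
    {
      "names": [ "logs-*" ],
      "privileges": [ "read" ],
      "query": {
        "bool": {
          // hide anything that lives on the frozen tier
          "must_not": {
            "terms": { "_tier": [ "data_frozen" ] }
          }
        }
      }
    }
  ]
}
```

Give that role to everyone, and give the handful of users who should see the historical data a role without the query. A range clause on @timestamp in the same query gets you the time-based variant.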
Really appreciate the detailed response — it’s very helpful.
This is indeed a big topic, and your points help frame it much better.
This is exactly what I was looking for! Thanks!
I’ve checked the snapshot compatibility guide you mentioned, and it makes sense that predicting compatibility that far into the future is uncertain. My main concern is simply making sure that using the frozen tier for long-term retention doesn’t create any hard limitations (for example, managing a very large number of mounted snapshots or an overly large repository) or migration challenges later on.
We’re in a very early evaluation stage and don’t have concrete answers yet to the questions you raised. The goal for now is to understand what technical options and constraints exist before moving forward.
At this stage, we’re considering these approaches:

- Long-term snapshots stored in S3 or Azure, restored on demand or partially mounted as needed (see the mount sketch below)
- A pure frozen tier approach (now that we know how to filter privileges by tier)
- A mixed strategy: theoretically (please correct me if I’m wrong), if a snapshot policy shares the same repository used by the frozen phase, it would be possible to retain data in the frozen tier for a shorter period while keeping the full backup for the total required retention time.
The idea is that, given the incremental nature of snapshots, repository storage usage would remain efficient, and when the frozen phase ends, the delete action would only unlink the searchable snapshot while the underlying data would still be referenced by the main backup policy. Would such a scenario be feasible?
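To make that concrete, here is roughly what I have in mind (just a sketch: the policy name, ages, and repository are placeholders, and I’m assuming the delete action’s delete_searchable_snapshot option behaves as documented):

```
PUT _ilm/policy/logs-long-retention
{
  "policy": {
    "phases": {
      "frozen": {
        "min_age": "30d",
        "actions": {
          "searchable_snapshot": {
            // same repository the SLM backup policy writes to
            "snapshot_repository": "long-term-repo"
          }
        }
      },
      "delete": {
        "min_age": "365d",
        "actions": {
          "delete": {
            // unmount the index but keep its snapshot in the repository
            "delete_searchable_snapshot": false
          }
        }
      }
    }
  }
}
```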
The motivation behind this is to avoid maintaining an increasingly large frozen tier over many years (with numerous indices, active ILM policies, and snapshots mounted in the cluster), which could eventually introduce significant load or operational overhead.
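As for the first option, this is the kind of on-demand partial mount I mean (again a sketch, with made-up repository, snapshot, and index names):

```
# Mount a single index from an old snapshot onto the frozen tier's shared cache
POST /_snapshot/long-term-repo/snapshot-2025.01/_mount?storage=shared_cache&wait_for_completion=true
{
  "index": "logs-2025.01",
  "renamed_index": "restored-logs-2025.01"
}
```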
Regarding the raw-files suggestion: if I understood correctly, that refers to storing the original logs as raw files in S3 for long-term retention, while keeping only recent or relevant data indexed in Elastic. Would this approach also make sense for metrics or APM data, where the raw format isn’t as straightforward as text logs?
I also appreciate the offer to connect — I’ll likely reach out directly once we have a clearer picture of the use case and some concrete numbers.
I say "raw" generically, but we have users who process the data in Logstash (with pipelines or integrations) and then write the processed logs to S3. So that's another option.
Traces and metrics....
That opens another line of thinking.
Typically people would do rollups / downsampling or something similar for longer-term retention, especially for metrics...
Really old metrics... 10-year-old metrics are pretty low value.
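If it helps, the downsampling piece can be driven from ILM itself. A rough sketch, assuming the metrics live in a time series data stream (TSDS), with placeholder names and intervals:

```
PUT _ilm/policy/metrics-downsample
{
  "policy": {
    "phases": {
      "warm": {
        "min_age": "30d",
        "actions": {
          // collapse raw samples into 1h buckets; works on TSDS indices only
          "downsample": { "fixed_interval": "1h" }
        }
      }
    }
  }
}
```

There is also a standalone _downsample API if you'd rather trigger it outside ILM.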
I'd have to think about traces a bit more, but there are ways to bifurcate the flow.
I think the value of traces also drops drastically unless there's some very valuable metadata in them.