Is it possible to move a one time snapshot to Glacier?

Hi,

The docs state that after snapshotting an index to S3 you shouldn't transition it to Glacier. I have a use case where I'd like to snapshot an index as a one-off event and store it as cheaply as possible, while still being able to recover it on the off chance we ever need it again.

In an ideal world I would take a snapshot under a given base_path (a folder in the bucket) and then transition that base_path to Glacier. Any other snapshotting in the meantime would happen under a different base_path. Then, if we ever needed to recover, I would transition the objects under that base_path back to standard S3 and restore. Would this be a valid workaround? Or is there an interaction between Elasticsearch and S3, or between standard S3 and Glacier, that I'm missing?
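To make the idea concrete, a lifecycle rule scoped to just one base_path might look something like this (bucket name, prefix, and rule ID are hypothetical; the `Days` value is an assumption):

```json
{
  "Rules": [
    {
      "ID": "archive-one-off-es-snapshot",
      "Status": "Enabled",
      "Filter": { "Prefix": "es-snapshots/one-off-2020/" },
      "Transitions": [
        { "Days": 1, "StorageClass": "GLACIER" }
      ]
    }
  ]
}
```

One caveat worth noting: a Glacier restore (`aws s3api restore-object`) only produces a temporary copy of each object; to genuinely move objects back to a standard storage class you'd copy each object over itself with the desired storage class before pointing Elasticsearch at the repository again.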

Many thanks,
John

From the documentation:
"You may use an S3 Lifecycle Policy to adjust the storage class of existing objects in your repository, but you must not transition objects to Glacier classes and you must not expire objects. If you use Glacier storage classes or object expiry then you may permanently lose access to your repository contents. For more information about S3 storage classes, see AWS Storage Classes Guide"

I think I would kind of expect that to work: as long as you can bring the repository back into the exact state that Elasticsearch left it, how would it know any different? It's effectively a repository backup.

Thanks for the quick reply!

Those were my thoughts too, but I want to store the data long term and I can't recreate it, so I need to be certain this will work before continuing. Given that, if I ran a small test, would you consider that conclusive? Or if I need it to be conclusive, should I use an alternative approach?

To qualify those qualifiers :smile: : I'm one of the developers on the team responsible for snapshots, so I should have said that this will definitely work as long as you (a) follow the repository backup procedure, and (b) bring the repository back into the state that Elasticsearch left it before you try to use it again. ES only cares about the objects' names and content; there's no other magic going on. My only uncertainty is whether you can restore the objects back faithfully, which is something over which I have no control.

Sounds like a good idea. But I'd also recommend testing the process with the actual data, both before you delete the data from the cluster, and periodically after that as well. As the saying goes: backups always succeed, it's only restores that fail.
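For the periodic restore tests, one way to avoid clashing with live indices is to restore under renamed index names. A hedged sketch against a test cluster (repository, snapshot, and index names are hypothetical):

```shell
# Restore the snapshotted index under a "restored_" prefix so the
# test restore never collides with the live index of the same name.
curl -X POST "localhost:9200/_snapshot/one_off_repo/snap_1/_restore" \
  -H 'Content-Type: application/json' -d'
{
  "indices": "my-index",
  "rename_pattern": "(.+)",
  "rename_replacement": "restored_$1"
}'
```

After verifying the restored index's document counts and spot-checking its contents, the `restored_*` index can simply be deleted.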


Excellent, thank you for the help.


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.