Which repository do you use to snapshot & restore large datasets?

igor_k · July 8, 2015, 8:46am

Hi,

We want to introduce snapshot and, in case of failure, restore to our ES setup.

In each of our clusters we have a few TB of primary data (one to two times that on disk, depending on the replication factor). We are experimenting with Google Cloud Storage in S3 compatibility mode as a repository, but the recovery rates are abysmally slow. It looks like we are being throttled by them.

I wonder whether you do snapshot and restore on large data sets. What is the size of your data? Which repository do you use? What rates do you get? Are you happy with the solution? Please share your experience.

Cheers,
Igor

warkolm · July 8, 2015, 9:59am

S&R doesn't take a copy of the replicas, just the primaries.

igor_k · July 8, 2015, 11:09am

I know :-), but my question was more about which repos you use,
what your dataset size is, and what your experience with S&R has been.

I think it'll be a good reference point for us on what is possible.

Thanks,
Igor

warkolm · July 8, 2015, 11:13am

We see a lot of people use S3 if they are already in AWS; they can then potentially leverage Glacier for long-term storage.
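
For reference, a minimal sketch of what registering an S3 repository might look like through the snapshot REST API (`PUT /_snapshot/<name>`). The host, repository name, bucket name, and rate values are placeholders, not anything from this thread. The `max_restore_bytes_per_sec` and `max_snapshot_bytes_per_sec` repository settings are worth checking when rates look slow, though their availability and defaults depend on the Elasticsearch and plugin version, so treat them as assumptions to verify. The function only builds the request; nothing is sent.

```python
import json
import urllib.request


def build_s3_repo_request(es_url, repo_name, bucket,
                          restore_rate="40mb", snapshot_rate="40mb"):
    """Build (but do not send) a PUT /_snapshot/<repo> request for an S3 repository.

    All argument values are hypothetical placeholders; the throttle settings
    should be checked against the docs for your Elasticsearch version.
    """
    body = {
        "type": "s3",
        "settings": {
            "bucket": bucket,
            # Per-node throttles applied by the repository itself.
            "max_restore_bytes_per_sec": restore_rate,
            "max_snapshot_bytes_per_sec": snapshot_rate,
        },
    }
    return urllib.request.Request(
        url=f"{es_url.rstrip('/')}/_snapshot/{repo_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    req = build_s3_repo_request(
        "http://localhost:9200", "my_s3_backup", "my-es-snapshots"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually register the repository, send it:
    # urllib.request.urlopen(req)
```

If restores are still slow after raising the repository limits, the cluster-level recovery throttle (`indices.recovery.max_bytes_per_sec`) may be the next thing to look at, since restore goes through the recovery mechanism.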
