Elasticsearch Snapshots in a Dedicated S3 Bucket (Cross-Region, Multi-Deployment)

Hello

I have a question about snapshots in Elasticsearch.
We use Elastic Cloud on AWS and want to use a dedicated S3 bucket in another region for snapshots instead of found-snapshots, so that we can rebuild our deployment if it fails completely.
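For context, this is roughly how we register the dedicated bucket as a repository via the snapshot repository API. A minimal sketch with placeholder names — the bucket, region, and base_path are ours, and depending on the stack version the credentials may need to go into the keystore or the Cloud console instead of the repository settings:

```
PUT _snapshot/dedicated_backup_repo
{
  "type": "s3",
  "settings": {
    "bucket": "my-dedicated-snapshot-bucket",
    "region": "eu-central-1",
    "base_path": "prod-deployment"
  }
}
```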

However, we have a problem with the partial indices for searchable snapshots. Currently, we cannot restore a snapshot because the partial indices are located in found-snapshots, which of course makes sense.
As a test, I created a test deployment with partial indices stored in the S3 bucket, but the partial indices do not work there after a restore either.

In addition, I always get an error when I register the S3 bucket as a repository in multiple deployments and create snapshots. The other deployments then do not display any snapshots, only the error. But as soon as I edit the repository settings, without changing anything, and simply save them, everything works again.

What would be the correct way to store snapshots in a dedicated S3 bucket located in another region and restore a snapshot with working partial indices?

This will be very expensive: cross-region traffic carries significant costs per GiB.


You would have to put the searchable snapshots into the remote-region repository too, and this would also be very expensive. See these docs in particular:

Most cloud providers charge significant fees for data transferred between regions and for data transferred out of their platforms. You should only mount snapshots into a cluster that is in the same region as the snapshot repository. If you wish to search data across multiple regions, configure multiple clusters and use cross-cluster search or cross-cluster replication instead of searchable snapshots.


See these docs, particularly:

  • Clusters should only register a particular snapshot repository bucket once. If you register the same snapshot repository with multiple clusters, only one cluster should have write access to the repository. On other clusters, register the repository as read-only.
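In practice, that means the error you see with multiple deployments is expected: only one deployment may write to the bucket, and every other deployment should register the same repository with the readonly setting. A sketch of the read-only registration (bucket name is a placeholder):

```
PUT _snapshot/shared_backup_repo
{
  "type": "s3",
  "settings": {
    "bucket": "my-dedicated-snapshot-bucket",
    "readonly": true
  }
}
```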

Thanks for your answer!
Okay, costs are not so much a problem.
We need just a good backup strategy.

There is the restore from another deployment feature in Elastic Cloud, but I can’t use it with Terraform, and I think it will not work if the deployment is down or broken?

If you want a backup, you do not need searchable snapshots, just normal snapshots; then the partial indices will not be a problem.
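A plain restore from such a backup snapshot then looks like a normal snapshot restore — a sketch, with placeholder repository, snapshot, and index names:

```
POST _snapshot/dedicated_backup_repo/nightly-snap-2024.01.01/_restore
{
  "indices": "logs-*",
  "include_global_state": false
}
```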

I have a deployment on Elastic Cloud where I have searchable snapshots, managed by ILM, and normal backup snapshots, managed by SLM.

Those backup snapshots are created for backup only and are stored in a bucket that we own.

I only have this for a couple of data streams, but you can have it for any data you want.

Basically you will duplicate the data, but these extra snapshots would be stored on your own S3, not the one managed by Elastic.
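As an illustration, the SLM policy for those extra backup snapshots is roughly like this — schedule, data stream patterns, and retention values here are placeholders, and the repository must already be registered:

```
PUT _slm/policy/nightly-backups
{
  "schedule": "0 30 1 * * ?",
  "name": "<nightly-snap-{now/d}>",
  "repository": "dedicated_backup_repo",
  "config": {
    "indices": ["logs-*", "metrics-*"],
    "include_global_state": false
  },
  "retention": {
    "expire_after": "30d",
    "min_count": 5,
    "max_count": 50
  }
}
```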


Yes, restoring without partials works. But then we do not have all the data on the new deployment.

My plan was to have everything backed up if something happens, but also to automatically set up a new deployment with Terraform and restore everything from the snapshot, with all data, if the first cluster breaks or is down. Maybe a region is down, or I have set up something wrong, or whatever.
The idea was to save normal snapshots and searchable snapshots in one bucket in another region.

I have seen there is a way to back up a repository?
Is this possible too for found-snapshots?

Sorry if I don’t understand everything, English is not my native language and I am only the intern in the company. Better don’t ask how I got this task :sweat_smile: