Configured S3 repository on multiple clusters with write-access, data corrupted

Tariq_Yahya · March 18, 2024, 12:46pm

I added the same S3 repository used by one of my clusters to another cluster, and I forgot to make it read-only on the second one. The documentation states that this can cause data corruption:

"If you register the same snapshot repository with multiple clusters, only one cluster should have write access to the repository. Having multiple clusters write to the repository at the same time risks corrupting the contents of the repository."
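For reference, registering the repository as read-only on the second cluster might look roughly like the sketch below. It only builds the `PUT _snapshot/<repository>` request and does not send it. The cluster URL, repository name, bucket, and base path are all made-up placeholders, not values from this thread.

```python
import json
import urllib.request


def build_readonly_repo_request(es_url, repo_name, bucket, base_path):
    """Build (but do not send) a request that registers an S3 repository as read-only."""
    body = {
        "type": "s3",
        "settings": {
            "bucket": bucket,
            "base_path": base_path,
            # Only the one cluster that takes snapshots should omit this.
            "readonly": True,
        },
    }
    return urllib.request.Request(
        url=f"{es_url.rstrip('/')}/_snapshot/{repo_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical values for illustration only.
req = build_readonly_repo_request(
    "http://localhost:9200", "shared_s3_repo", "my-snapshot-bucket", "snapshots/prod"
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# Sending it would be urllib.request.urlopen(req), plus whatever auth the cluster needs.
```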

I need to restore this data, is this possible or is the data lost forever?

DavidTurner · March 18, 2024, 8:58pm

We can't offer any guarantees but you might be lucky. All reasonably recent versions of ES try really quite hard to avoid repository corruption in this situation. The only way to find out for sure is to try it tho.

Tariq_Yahya · March 18, 2024, 9:13pm

What exactly should I try doing? Adding the repository again? My cluster is version 7.17 btw.

Is updating the cluster an option?

DavidTurner · March 18, 2024, 9:26pm

What is the exact error message you're seeing?

Tariq_Yahya · March 18, 2024, 10:12pm

I got this error once:

Could not read repository data because the contents of the repository do not match its expected state. This is likely the result of either concurrently modifying the contents of the repository by a process other than this cluster or an issue with the repository's underlying storage. The repository has been disabled to prevent corrupting its contents. To re-enable it and continue using it please remove the repository from the cluster and add it again to make the cluster recover the known state of the repository from its physical contents.

Then I removed and re-added the repository, which led to all the snapshots disappearing, and they haven't come back since.

The error also disappeared.

I'll detail exactly what I did with the two clusters:

1. Created a new cluster and connected it to the S3 bucket that the old cluster uses as a repository. The same S3 bucket is now a repository on both clusters.
2. Took a snapshot from the new cluster, expecting it to appear on the old cluster. It did not appear.
3. At some point I got the above error on the old cluster, which led me to remove and re-add the S3 repository on both clusters.
4. Now neither cluster can discover any of the snapshots.

DavidTurner · March 19, 2024, 8:38am

Are you sure you've configured the repository the same as before, with the same bucket and base path and so on? If so, and ES still cannot list the snapshots there, then unfortunately it won't be able to restore anything. But I would not expect having two clusters writing to the repo to do such comprehensive damage to the repository so quickly.
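One way to check this is to compare the `settings` object that `GET _snapshot/<repository>` returns now against a record of the original registration. The sketch below assumes such a record exists (for example from the other cluster or from deployment config); the setting values are invented for illustration.

```python
def diff_repo_settings(before, after, keys=("bucket", "base_path", "client")):
    """Return {key: (before_value, after_value)} for each key whose values differ."""
    return {
        key: (before.get(key), after.get(key))
        for key in keys
        if before.get(key) != after.get(key)
    }


# Hypothetical settings: the re-added repository points at a different base path.
original = {"bucket": "my-snapshot-bucket", "base_path": "snapshots/prod"}
re_added = {"bucket": "my-snapshot-bucket", "base_path": "snapshots"}

print(diff_repo_settings(original, re_added))
# → {'base_path': ('snapshots/prod', 'snapshots')}
```

An empty result means the compared settings match, in which case a missing snapshot listing would point at the repository contents rather than the registration.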

Tariq_Yahya · March 19, 2024, 2:45pm

Thanks a lot David, I found out after your tip that my path was incorrect. I've corrected it now and I can see all the snapshots.

The issue now is that when I try to restore them I get this error:

Unable to restore snapshot

[*****:application-logs-2023.01.01-90engxybs_qfy_lrnykgoa/bvczrwHNRq2KVZljI75wWw] is missing

DavidTurner · March 19, 2024, 3:32pm

Right, that's more like the kind of error I'd expect after a repository had multiple writers. You'll need to choose a different snapshot to restore.
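As a rough sketch of how one might pick that different snapshot, the function below filters the response of `GET _snapshot/<repository>/_all` down to successful snapshots that contain a given index, newest first. The snapshot and index names in the sample data are invented, and whether any of the candidates actually restores still has to be tried on the real repository.

```python
def restore_candidates(listing, index_name, exclude=()):
    """Names of SUCCESS snapshots containing index_name, newest first, minus excluded ones."""
    snapshots = [
        s
        for s in listing.get("snapshots", [])
        if s.get("state") == "SUCCESS"
        and index_name in s.get("indices", [])
        and s.get("snapshot") not in exclude
    ]
    snapshots.sort(key=lambda s: s.get("start_time_in_millis", 0), reverse=True)
    return [s["snapshot"] for s in snapshots]


# Hypothetical listing in the shape returned by the get-snapshot API.
listing = {
    "snapshots": [
        {"snapshot": "daily-03", "state": "SUCCESS",
         "start_time_in_millis": 300, "indices": ["app-logs"]},
        {"snapshot": "daily-02", "state": "PARTIAL",
         "start_time_in_millis": 200, "indices": ["app-logs"]},
        {"snapshot": "daily-01", "state": "SUCCESS",
         "start_time_in_millis": 100, "indices": ["app-logs", "metrics"]},
    ]
}

# Skip the snapshot that failed to restore and list what is left to try.
print(restore_candidates(listing, "app-logs", exclude={"daily-03"}))
# → ['daily-01']
```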

Tariq_Yahya · March 19, 2024, 4:00pm

Recovering said snapshot is not possible?

DavidTurner · March 19, 2024, 6:34pm

Likely not. Depends on the exact details of the error message (and stack trace) but this is the sort of thing that can happen when one cluster deletes this snapshot while another cluster is writing to the repository.

system · April 16, 2024, 6:34pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
