I added the same S3 repository used by one of my clusters to another cluster, and I forgot to register it as read-only on the second cluster. The documentation states that this could cause data corruption:
"If you register the same snapshot repository with multiple clusters, only one cluster should have write access to the repository. Having multiple clusters write to the repository at the same time risks corrupting the contents of the repository."
I need to restore this data. Is this possible, or is the data lost forever?
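For reference, the read-only registration the documentation describes could be sketched roughly as follows. This is a minimal sketch, not a definitive procedure: the cluster URL, repository name, bucket and base path are hypothetical placeholders, authentication is omitted, and the helper names are made up for illustration. It assumes the S3 repository type with its `readonly` setting, sent through the `PUT _snapshot/<repository>` API.

```python
import json
import urllib.request


def readonly_repo_body(bucket, base_path=None):
    """Build the request body for a read-only S3 snapshot repository."""
    settings = {"bucket": bucket, "readonly": True}
    if base_path:
        # base_path must match what the writing cluster uses, or the
        # snapshots will not be found.
        settings["base_path"] = base_path
    return {"type": "s3", "settings": settings}


def register_repo(es_url, repo_name, body):
    """Send PUT _snapshot/<repo_name> to the cluster (no auth handling here)."""
    req = urllib.request.Request(
        f"{es_url}/_snapshot/{repo_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Hypothetical values; replace with the real bucket and base path.
body = readonly_repo_body("my-snapshot-bucket", "old-cluster/snapshots")
print(json.dumps(body, indent=2))
# register_repo("http://localhost:9200", "shared_repo", body)  # needs a live cluster
```

Only the second cluster would be registered this way; the cluster that takes the snapshots keeps its normal, writable registration.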
We can't offer any guarantees, but you might be lucky. All reasonably recent versions of ES try quite hard to avoid repository corruption in this situation. The only way to find out for sure is to try it, though.
Could not read repository data because the contents of the repository do not match its expected state. This is likely the result of either concurrently modifying the contents of the repository by a process other than this cluster or an issue with the repository's underlying storage. The repository has been disabled to prevent corrupting its contents. To re-enable it and continue using it please remove the repository from the cluster and add it again to make the cluster recover the known state of the repository from its physical contents.
Then I removed and re-added the repository, which led to all snapshots being gone, and they haven't come back since.
The error also disappeared.
I'll detail exactly what I did with the two clusters:
Created a new cluster and connected it to the S3 that's being used as a repository by the old cluster. The same S3 is now a repository on both clusters.
I proceeded to upload a snapshot from the new cluster, wanting it to appear on the old cluster. It did not appear.
Then at some point I got the above error on the old cluster, which led me to remove and re-add the S3 repository on both clusters.
Neither cluster can discover any of the snapshots.
Are you sure you've configured the repository the same as before, with the same bucket and base path and so on? If so, unfortunately if ES cannot list the snapshots there then it won't be able to restore anything. But I would not expect having two clusters writing to the repo to do such comprehensive damage to the repository so quickly.
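One way to check both points (same bucket and base path, and whether ES can list any snapshots) is sketched below. It is a rough sketch under assumptions: the cluster URLs and repository name are hypothetical, authentication is omitted, and the helper names are made up. It relies on `GET _snapshot/<repository>` for the repository definition and `GET _snapshot/<repository>/_all` for the snapshot list; the exact response shape differs between ES versions, so the parsing is deliberately tolerant.

```python
import json
import urllib.request


def get_json(url):
    """Fetch a URL and decode the JSON response (no auth handling here)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def location_settings(repo_definition):
    """Pick out the settings that decide where the repository lives in S3."""
    settings = repo_definition.get("settings", {})
    return {key: settings.get(key) for key in ("bucket", "base_path")}


def same_location(old_def, new_def):
    """True when both definitions point at the same bucket and base path."""
    return location_settings(old_def) == location_settings(new_def)


def snapshot_names(response):
    """Extract snapshot names from a GET _snapshot/<repo>/_all response.

    Assumption: snapshots sit either under a top-level "snapshots" key or
    under per-repository entries in "responses", depending on the version.
    """
    groups = response.get("responses", [response])
    return [s["snapshot"] for g in groups for s in g.get("snapshots", [])]


def check(old_url, new_url, repo):
    """Compare the repository on two clusters; needs both clusters reachable."""
    old_def = get_json(f"{old_url}/_snapshot/{repo}")[repo]
    new_def = get_json(f"{new_url}/_snapshot/{repo}")[repo]
    names = snapshot_names(get_json(f"{old_url}/_snapshot/{repo}/_all"))
    return same_location(old_def, new_def), names
```

If the locations match and the list is still empty, it is also worth looking at the bucket directly to see whether the repository's metadata files are still present under the base path.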
Likely not. Depends on the exact details of the error message (and stack trace) but this is the sort of thing that can happen when one cluster deletes this snapshot while another cluster is writing to the repository.