We have a backup process that uses Curator to snapshot to S3; unfortunately, the delete job for those snapshots has not been running. Deleting snapshots via Curator is very slow against S3, and from reading the forums this is caused by the number of snapshots and should get faster as we delete more. Currently, though, we have a year's worth of data, about 16 TB.
I am wondering if it would be better to create a new S3 bucket and register a new snapshot repository pointing at that bucket, with the Curator delete job set up correctly this time (the repository commands are sketched after these steps).
Run backups against it for the next 20 days to get back to our standard backup coverage.
Then delete the old snapshot repository in Elasticsearch, which I don't believe will delete the data in S3 (but I am not sure), i.e. DELETE /_snapshot/myoldbackuplocation
Then set S3 lifecycle rules to delete the data from the old bucket.
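Concretely, the repository side of that plan would be something like the following (repository and bucket names for illustration; assuming the repository-s3 plugin and its AWS credentials are already configured, with any other repository settings omitted for brevity):

PUT /_snapshot/newonebackuplocation
{
  "type": "s3",
  "settings": {
    "bucket": "bucket2"
  }
}

# then, once the new repository has built up enough coverage:
DELETE /_snapshot/myoldbackuplocation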
Is this an acceptable solution, or will Elasticsearch still have problems with me removing the snapshot repository? From what I can read this should work and is potentially the quickest and cheapest way to solve the problem.
Maybe the second dumb question of the day: are _snapshot repositories treated individually in Elasticsearch, so that one repository never looks at another _snapshot location and mixes up the backup files when restoring data in the future?
Any advice or clarity would be great, many thanks.
I am running Elasticsearch 6.8.2.1 at present, so it's a pretty old cluster setup.
Much of the slowness of snapshot deletes has long since been corrected in Elasticsearch 7.x (over the course of several releases), but you are definitely correct: if you're still on 6.x, it's better to create a new bucket, register a new repository pointing at that bucket, and set up your deletion procedures properly there.
> Then set S3 lifecycle rules to delete the data from the old bucket.
The blobs in the repository are all interdependent, but lifecycle rules don't know about these dependencies, so the first time a lifecycle rule deletes a blob it will unrecoverably break the repository. I'd recommend just deleting the repo contents completely, in one go, once you don't need them any more.
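To be explicit about the part you weren't sure of: DELETE /_snapshot/... only deregisters the repository from the cluster, it doesn't touch the blobs in the bucket. So the safe sequence, using your repository name, is:

# Deregister the repository; this removes Elasticsearch's reference only
# and deletes nothing from bucket1:
DELETE /_snapshot/myoldbackuplocation

# Then delete bucket1 (or all of its contents in one go) with your S3
# tooling, rather than letting a lifecycle rule pick blobs off one by one.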
> I am running Elasticsearch 6.8.2.1 at present, so it's a pretty old cluster setup.
Yeah, there have definitely been improvements to delete speed in more recent versions. Also, SLM (snapshot lifecycle management) is now built in, so you don't need a separate Curator process for periodic backups any more.
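For when you do upgrade, a minimal SLM policy looks something like this (policy name, schedule, and retention here are illustrative; SLM landed in 7.4 and its retention feature in 7.5):

PUT /_slm/policy/nightly-snapshots
{
  "schedule": "0 30 1 * * ?",
  "name": "<nightly-snap-{now/d}>",
  "repository": "newonebackuplocation",
  "config": { "indices": ["*"] },
  "retention": { "expire_after": "20d" }
}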
David, I am planning on removing the old snapshot repo with:
DELETE /_snapshot/myoldbackuplocation
/_snapshot/myoldbackuplocation -> points to bucket1 on S3
Then I will create a new snapshot repository pointing to a different bucket:
/_snapshot/newonebackuplocation -> points to bucket2 on S3
Then my lifecycle rule will empty bucket1 and clean it up.
My assumption is that /_snapshot/newonebackuplocation is independent of the old /_snapshot/myoldbackuplocation.
So basically, as long as my assumption is correct that _snapshot repos are independent of each other, this should work. But that is my core question, and it sounds like you are saying they are not, @DavidTurner?
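To spell that assumption out, I'd expect each repository to be addressed purely by its own path, along the lines of:

GET /_snapshot/_all                       # lists both registered repositories
GET /_snapshot/myoldbackuplocation/_all   # snapshots held in bucket1 only
GET /_snapshot/newonebackuplocation/_all  # snapshots held in bucket2 only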
Just to clarify, I am not planning on using lifecycle rules on the new bucket/repo. I am planning on using Curator as you normally would. I understand that if I ran lifecycle rules against bucket2, the new bucket, it would end in tears. I just need to clean the 16 TB out of the old bucket and delete it.
Don't "clean" myoldbackuplocation. Wait for newonebackuplocation to reach your retention period, then delete the myoldbackuplocation repository, and then just wholly delete bucket1 using S3 (not Elastic).
@theuntergeek that is the current plan of attack, based on the idea that Elasticsearch snapshot repos are independent of each other, which I believe they are. But I am not 100% sure.