S3 Curator delete process slow - Is this a bad idea?

We have a backup process that uses Curator to take snapshots to S3. Unfortunately, the delete job for the snapshots has not been running. Deleting snapshots from S3 with Curator is very slow; from what I have read on the forums, this is caused by the number of snapshots and should get faster as we delete more. Currently, though, the repository holds a year's worth of snapshots, about 16 TB of data.

I am wondering if it would be better to create a new S3 bucket and register a new snapshot repository for that bucket (and set up the Curator delete job correctly this time).

Then run the backups for the next 20 days to get back to our standard backup coverage.

Then delete the snapshot repository in Elasticsearch, which I don’t believe will delete the data in S3 (but I am not sure), i.e. DELETE /_snapshot/myoldbackuplocation

Then set S3 lifecycle rules to delete the data off the old bucket.

Is this an acceptable solution? Or will Elasticsearch still have problems with me removing the snapshot? From what I can read this should work and potentially is the quickest and cheapest way to solve the problem.

Maybe the second dumb question of the day: are _snapshot repositories treated independently in Elasticsearch, so that one does not look at another and mix up the backup files when restoring data in the future?

Any advice or clarity would be great, many thanks.

I am running on elasticsearch 6.8.2.1 at present so pretty old cluster setup.

Yes, I 100% recommend this exact process.

Much of the slowness of snapshot deletes has long since been fixed in Elasticsearch 7.x (over the course of several releases), but you are definitely correct: if you're still on 6.x, it's better to create a new bucket, register a new repository pointing to that bucket, and set up procedures that work better there.

Mostly sounds good, except this bit:

Then set S3 lifecycle rules to delete the data off the old bucket.

The blobs in the repository are all inter-dependent but lifecycle rules don't know about these dependencies, so the first time the lifecycle rule deletes a blob it will unrecoverably break the repository. I'd recommend just deleting the repo contents completely when you don't need them any more.
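
As a toy illustration of that dependency problem, here is a small Python sketch. It is not the real repository layout: the blob names, snapshot names, and dates are all made up, and the only thing it models is that snapshots are incremental, so a later snapshot can reuse a blob that an earlier one uploaded.

```python
from datetime import date, timedelta

# Toy model only: names, dates, and layout are invented for illustration
# and do not reflect the real repository format.
blob_written_on = {
    "segment_a": date(2020, 1, 1),  # uploaded by the first snapshot
    "segment_b": date(2020, 1, 1),
    "segment_c": date(2020, 6, 1),  # uploaded by a later snapshot
}

# The later snapshot reuses segment_a instead of uploading it again.
snapshots = {
    "snap-2020-01-01": {"segment_a", "segment_b"},
    "snap-2020-06-01": {"segment_a", "segment_c"},
}


def expire_by_age(blobs, today, max_age_days):
    """Mimic an age-based lifecycle rule: keep only blobs newer than the cutoff."""
    cutoff = today - timedelta(days=max_age_days)
    return {name for name, written in blobs.items() if written >= cutoff}


def broken_snapshots(snaps, surviving_blobs):
    """Return the snapshots that reference at least one missing blob."""
    return sorted(
        name for name, needed in snaps.items() if not needed <= surviving_blobs
    )


surviving = expire_by_age(blob_written_on, today=date(2020, 6, 10), max_age_days=30)
broken = broken_snapshots(snapshots, surviving)
print(broken)
```

In this toy case the recent snapshot breaks as well, because the rule expired an old blob that it still depended on.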

I am running on elasticsearch 6.8.2.1 at present so pretty old cluster setup.

Yeah, there have definitely been improvements to delete speed in more recent versions. Also, SLM is now built in, so you don't need a separate Curator process to do periodic backups any more.
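
For a rough idea of what that looks like after an upgrade (SLM is not available on 6.8), here is a Python sketch that only builds an SLM policy request and prints it. The policy id, schedule, snapshot name pattern, index selection, and the 20-day retention are placeholder assumptions, and the retention block needs a release that supports SLM retention.

```python
import json


def slm_policy_body(repository, expire_after="20d"):
    """Build an illustrative SLM policy body; every value here is a placeholder."""
    return {
        "schedule": "0 30 1 * * ?",        # cron expression: daily at 01:30
        "name": "<nightly-snap-{now/d}>",  # date-math snapshot name
        "repository": repository,
        "config": {"indices": ["*"]},
        "retention": {"expire_after": expire_after},
    }


policy_id = "nightly-snapshots"  # hypothetical policy id
request_path = f"/_slm/policy/{policy_id}"
body = slm_policy_body("newonebackuplocation")

# Nothing is sent to a cluster; this only prints the request.
print("PUT", request_path)
print(json.dumps(body, indent=2))
```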

Thank you both for your responses, @theuntergeek and @DavidTurner.

David, I am planning on removing the old snapshot repo with the command

DELETE /_snapshot/myoldbackuplocation

/_snapshot/myoldbackuplocation -> Points to bucket1 on s3

Create a new snapshot location pointing to a different bucket.
/_snapshot/newonebackuplocation -> Points to bucket2 on s3
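
A minimal Python sketch of the request that would register the new repository, assuming the repository-s3 plugin is installed and S3 credentials are already configured on the cluster. Only the bucket setting is shown, and nothing is actually sent to Elasticsearch.

```python
import json


def s3_repository_request(repo_name, bucket):
    """Build the request that registers an S3 snapshot repository."""
    path = f"/_snapshot/{repo_name}"
    body = {"type": "s3", "settings": {"bucket": bucket}}
    return "PUT", path, body


method, path, body = s3_repository_request("newonebackuplocation", "bucket2")
print(method, path)
print(json.dumps(body))
```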

Then my lifecycle rule will empty bucket1 and clean it up.

My assumption is that /_snapshot/newonebackuplocation is independent of the old one, /_snapshot/myoldbackuplocation.

So basically, as long as my assumption is correct that _snapshot repos are independent of each other, this should work. But that is my core question, and it sounds like you are saying they are not, @DavidTurner?

Just to clarify, I am also not planning on using lifecycle rules on the new bucket / repo location. I am planning on using Curator there as normal. I understand that if I ran lifecycle rules on bucket2 with the new repo it would end in tears. :wink: I just need to clear the 16 TB out of the old bucket and delete it.

Don't "clean" myoldbackuplocation. Wait for newonebackuplocation to reach your retention period, then delete the myoldbackuplocation repository, and then just wholly delete bucket1 using S3 (not Elastic).
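
That ordering can be sketched in Python as below. The dates are made up, the 20-day retention comes from the plan earlier in the thread, and the functions only describe the steps rather than calling Elasticsearch or S3.

```python
from datetime import datetime, timedelta


def retention_reached(oldest_new_snapshot, now, retention_days=20):
    """True once the new repository covers the whole retention period."""
    return now - oldest_new_snapshot >= timedelta(days=retention_days)


def next_steps(oldest_new_snapshot, now,
               old_repo="myoldbackuplocation", old_bucket="bucket1"):
    """Describe what to do next; nothing here talks to a cluster or to S3."""
    if not retention_reached(oldest_new_snapshot, now):
        return ["wait: the new repository does not cover the retention period yet"]
    return [
        # Unregistering only removes the reference; the data in S3 stays put.
        f"DELETE /_snapshot/{old_repo}",
        f"delete {old_bucket} entirely with S3 tooling, outside Elasticsearch",
    ]


first_new_snapshot = datetime(2021, 3, 1)  # made-up date for illustration
too_early = next_steps(first_new_snapshot, now=datetime(2021, 3, 10))
ready = next_steps(first_new_snapshot, now=datetime(2021, 3, 25))
print(too_early)
print(ready)
```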


@theuntergeek that is the current plan of attack, based on the idea that Elasticsearch snapshot repos are independent of each other, which I believe they are. But I am not 100% sure :wink:

They are.

@DavidTurner / @theuntergeek thank you so much for your help. I live in London, so if I ever bump into you both I will buy you a beer/coffee/doughnuts :wink:
