Delete S3 snapshots through the API to minimize S3 bucket size

Hello everyone,

I am currently having trouble managing old snapshots in S3: the bucket keeps growing because we have to keep the old snapshots around. We tried S3 lifecycle policies, but that did not work out because the data ends up corrupted. I have also tried deleting snapshots through the API, but that does not clean up the snapshot data in S3; the bucket size stays the same after the delete operation. Please suggest some options for cleaning up old snapshots in S3 so we can reduce the bucket size. I have looked at Curator, but we are on a very old version of Elasticsearch (1.3).

I was not aware of the S3 repository plugin failing to delete data when snapshots are deleted. I wonder if this is due to you running such an extremely old version. I would really recommend that you upgrade to a more recent version, as the one you are using is almost 5 years old and has been EOL for a long time.

Hi @areddy7021,

Unfortunately, older versions of Elasticsearch had trouble cleaning up unreferenced data in repositories correctly in many cases.
The issue was eventually fixed in https://github.com/elastic/elasticsearch/pull/42189. I'm afraid there isn't a solution for users of versions prior to 7.4, though.
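
For completeness: on 7.4 and later there is also a repository cleanup API that can be triggered manually to remove stale data left behind by earlier failed deletes. A minimal sketch, assuming a repository named my_s3_repo (a placeholder):

```bash
# 7.4+ only; "my_s3_repo" is a placeholder repository name.
# Scans the repository and deletes data not referenced by any existing snapshot.
curl -XPOST "http://localhost:9200/_snapshot/my_s3_repo/_cleanup"
```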

Thank you everyone for the input, but I am looking for a way to delete snapshots regardless of the ES version. Is there any way to reduce the size of the ES snapshots in the S3 buckets, to avoid huge bills for storage we don't need? We don't need any of the old snapshots. I could put lifecycle policies in place, but those break restores. So I am looking for a version-independent approach to managing the old snapshots in the S3 buckets: a safe way to delete them that does not impact restore.

Sorry @areddy7021, there's no such thing. It took significant development effort to implement a safe algorithm to clean up stale data in the repository. There is no reasonable way to manually implement such a cleanup on top of ES.

OK, thank you. I have used the delete API to delete a particular snapshot, but it is not cleaning things up in S3. Is that expected? It deleted the snapshot on the ES side, but not in S3.

I have used the snapshot delete endpoint; the snapshot gets deleted on the ES instance, but when I look in S3 it remains as is.
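
For reference, the call I am running looks roughly like this (repository and snapshot names here are placeholders, not our real ones):

```bash
# Removes the snapshot from the cluster's snapshot list, but on our
# 1.3 cluster the corresponding objects stay behind in the S3 bucket.
curl -XDELETE "http://localhost:9200/_snapshot/my_s3_repo/snapshot_2019_01_01"
```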

Is this not getting backported to 6.8?

No, unfortunately it's not. It's a pretty complex change and we eventually deemed it too risky to backport.

@areddy7021 the delete will have partially failed (it likely deleted the snapshot from the repository metadata but not the actual data associated with it), which leads to stale data that is never cleaned up.
This is a known issue in older versions.

So the delete API is expected to work that way: it won't clean up the actual backup store, it only cleans up at the ES instance level. Am I correct? My only concern is the size of the ES backups in S3, which will keep growing if we don't clean up on a regular basis.

Hmm, got it. So would Curator work here to delete a specific snapshot?

@areddy7021 the problem isn't that the delete logic does not work in principle.
The problem is that a partially failed delete can leave behind files that are never cleaned up and might prevent files from being cleaned up in subsequent deletes.
Curator won't help you here either.

If you absolutely cannot upgrade, the best workaround I can see would be to periodically move to a fresh repository (a different bucket or a different sub-path) and delete the old repository once you no longer need any of the snapshots in it.
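
Roughly, the rotation could look like this once you no longer need anything in the old repository (repository name, bucket, and prefix below are placeholders):

```bash
# Deregister the old repository from Elasticsearch. This only removes the
# repository definition from the cluster; it does not delete any data in S3.
curl -XDELETE "http://localhost:9200/_snapshot/old_backups"

# Then remove the old repository's prefix in S3 with the AWS CLI,
# which is what actually frees the storage.
aws s3 rm "s3://my-es-backups/old_backups/" --recursive
```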

We don't have an upgrade option right now, but I will take a look at the alternative you suggested. Thanks a lot, such a great response from your team. If there is any chance, please take a look at my issue :slight_smile:

Can I create a new folder in the same bucket, push the new backup snapshots to that new folder, and delete the old content? We want to avoid interrupting our ES service at any point for this backup, i.e. no restart.

@areddy7021 yes, you can set different base_path settings for the new repositories to snapshot to separate paths in the same S3 bucket.
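
For example, a sketch of registering a new repository under its own prefix in the same bucket (repository name, bucket, and base_path are placeholders; credentials and region settings are omitted and depend on your S3 plugin configuration):

```bash
curl -XPUT "http://localhost:9200/_snapshot/backups_2020_q1" \
  -H 'Content-Type: application/json' -d '{
  "type": "s3",
  "settings": {
    "bucket": "my-es-backups",
    "base_path": "backups_2020_q1"
  }
}'
```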
