Delete some snapshots on S3


(olivier hodac) #1

Hello,

I have automated snapshots to S3 with a crontab entry:

0 */3 * * * curl -XPUT -u elastic:aaaaaa "http://localhost:9200/_snapshot/s3_repository/snapshot_$(date +\%Y\%m\%d_\%H)"

I can see my S3 bill going up, so I want to schedule the deletion of old snapshots with cron.
I have started with

curl -XDELETE -u elastic:aaaaa "http://localhost:9200/_snapshot/s3_repository/$old_snap"

but it is veeery slow. What is the best practice for managing snapshots? I only need a set of snapshots to recover from a crash or an admin error, so about 2 weeks of retention.

best


(Tek Chand) #2

@dao, you can use S3 lifecycle management. With a lifecycle rule you can set how long your data is kept before it is deleted from S3.

Thanks.


(olivier hodac) #3

Correct, but how do I know which files in the bucket I can delete? I imagine that some of the files created by the snapshots are linked in some way (I understand that the snapshots are incremental). If I delete a file, I may corrupt some snapshots, right?


(Tek Chand) #4

@Dao, as far as I can tell, we can delete any snapshot and it will not have any impact on the other snapshots.

You can also use Elasticsearch Curator to delete the snapshots.

Please refer to the link below for the Curator config:
https://discuss.elastic.co/t/deleting-old-snapshots/134085

Please refer to the link below, which says that we can keep the snapshots we want and delete the rest.

https://support.cloudbees.com/hc/en-us/articles/115000592472-Managing-snapshots-of-your-Elasticsearch-indices-

Thanks.


(Tek Chand) #5

@Dao, as I understand it, indices are created with a timestamp, meaning one index per day for a single index pattern. So all the indices are independent of each other. You can delete any index or snapshot and it will not have any impact on the other indices or snapshots.

Thanks.


(Christian Dahlqvist) #6

Do not ever delete files directly from a snapshot repository unless you are deleting the complete repository. Instead use the APIs to delete snapshots. This blog post is old, but still describes what goes on behind the scenes and how it works quite well.
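
To make "use the APIs" concrete, here is a minimal sketch of an expiry script that only talks to the snapshot API and never touches the bucket. It assumes the `_cat/snapshots` endpoint is available on your version; `ES_URL`, `ES_AUTH`, `REPO` and `RETENTION_DAYS` are placeholder names of mine, not something from this thread, and the credentials are dummies.

```shell
#!/bin/sh
# Sketch: expire old snapshots through the snapshot API only.
# ES_URL, ES_AUTH, REPO and RETENTION_DAYS are assumed placeholders.
ES_URL="${ES_URL:-http://localhost:9200}"
ES_AUTH="${ES_AUTH:-elastic:changeme}"
REPO="${REPO:-s3_repository}"
RETENTION_DAYS="${RETENTION_DAYS:-14}"

# Print "id status end_epoch" for every snapshot in the repository.
list_snapshots() {
  curl -s -u "$ES_AUTH" "$ES_URL/_cat/snapshots/$REPO?h=id,status,end_epoch"
}

# Read "id status end_epoch" lines on stdin and print the ids of finished
# snapshots that ended before the cutoff given as $1 (epoch seconds).
select_expired() {
  awk -v cutoff="$1" 'NF == 3 && $2 != "IN_PROGRESS" && $3 + 0 < cutoff + 0 { print $1 }'
}

# Delete every expired snapshot, one API call per snapshot.
expire_snapshots() {
  cutoff=$(( $(date +%s) - RETENTION_DAYS * 86400 ))
  list_snapshots | select_expired "$cutoff" | while read -r snap; do
    curl -s -XDELETE -u "$ES_AUTH" "$ES_URL/_snapshot/$REPO/$snap"
  done
}

# Nothing touches the cluster unless the script is called with "run".
if [ "${1:-}" = "run" ]; then
  expire_snapshots
fi
```

Selecting by end time rather than by name means a missed cron run is caught up on the next run. Each delete still has to work out which segments are unreferenced, so this does not make an individual delete faster.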


(Tek Chand) #7

@Christian, Thank you for your response.

Yes... it would be quite difficult to identify the files for a specific snapshot.

I have read that post, and snapshots are incremental. So I have one question for you: can we delete any snapshot at any time using the API or Curator?

Can we set a lifecycle rule on our S3 bucket so that it deletes the snapshots from the bucket? As far as I know, that would be like deleting a file directly from the repository, so we should not use an S3 lifecycle rule to delete snapshots.

Thanks.


(olivier hodac) #8

Therefore, back to my initial question: deleting a snapshot is veeery slow (using the APIs)

so, is there an alternate solution?


(olivier hodac) #9

There is no solution, then?


(Christian Dahlqvist) #10

@Tek_Chand You can delete any snapshot at any time. The remaining snapshots will make sure they still have access to all the segments required to report data as of their point in time.

You should NOT delete files directly from a bucket as old segments may still be in use, which would potentially corrupt newer snapshots.


(Christian Dahlqvist) #11

@dao If you have a lot of snapshots, a lot of processing can go into determining exactly which segments need to be kept. You could perhaps set up a new repository and switch to it. Be sure to trim old snapshots regularly to keep the size down. This will also make the repository easier to process. Once you no longer need the old repository, you can delete it completely.
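
Switching repositories could look roughly like the sketch below. It assumes the S3 repository plugin is installed; the repository name, bucket and `base_path` in the example call are hypothetical, and `ES_URL` / `ES_AUTH` are placeholders.

```shell
#!/bin/sh
# Sketch: register a fresh S3 repository and point new snapshots at it.
# All names below are hypothetical placeholders.
ES_URL="${ES_URL:-http://localhost:9200}"
ES_AUTH="${ES_AUTH:-elastic:changeme}"

# Build the repository definition: $1 = bucket, $2 = base_path inside the bucket.
repo_body() {
  printf '{"type":"s3","settings":{"bucket":"%s","base_path":"%s"}}' "$1" "$2"
}

# Register repository $1, backed by bucket $2 under prefix $3.
register_repo() {
  curl -s -XPUT -u "$ES_AUTH" -H 'Content-Type: application/json' \
    "$ES_URL/_snapshot/$1" -d "$(repo_body "$2" "$3")"
}

# Example call (commented out so the sketch has no side effects):
# register_repo s3_repository_v2 my-snapshot-bucket snapshots-v2
```

After that, the snapshot cron job only needs the new repository name in its URL. Note that removing the old repository with `DELETE /_snapshot/<old_repo>` only unregisters it from the cluster; the files stay in S3 until you remove that bucket or prefix yourself.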


(olivier hodac) #13

OK, I'm moving to this process. Here is the current setup, in case it helps:

  - name: Cron snapshots
    cron: name="elk-snapshots" minute="0" hour="*/3" job="curl -XPUT -u snap:aaaa \"http://localhost:9200/_snapshot/s3_bigdata_repo/snapshot_$(date +\%Y\%m\%d_\%H)\""
    when: backup==True
    tags: snapshot


  - name: Delete 15-day-old cron snapshots
    cron: name="elk-snapshots-delete-old" minute="40" hour="*/3" job="curl -XDELETE -u snap:aaaa \"http://localhost:9200/_snapshot/s3_bigdata_repo/snapshot_$(date --date=\"15 day ago\" +\%Y\%m\%d_\%H)\""
    when: backup==True
    tags: snapshot
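
One caveat with the delete job above: it removes exactly one snapshot per run, so if a run is missed, that snapshot stays in the repository. A small catch-up variant is sketched below. It assumes GNU date, as the cron job does; the script layout, the 15 to 17 day window and the `ES_URL` / `ES_AUTH` / `REPO` variables are my own placeholders. Inside a script, `%` does not need the backslash escaping that crontab requires.

```shell
#!/bin/sh
# Sketch: catch-up deletion for the snapshot_YYYYMMDD_HH naming scheme.
# Requires GNU date (same --date syntax as the cron job above).
ES_URL="${ES_URL:-http://localhost:9200}"
ES_AUTH="${ES_AUTH:-snap:changeme}"
REPO="${REPO:-s3_bigdata_repo}"

# Name of the snapshot taken $1 days ago at the current hour.
snapshot_name() {
  date --date="$1 day ago" +snapshot_%Y%m%d_%H
}

# Delete the same-hour snapshots from 15, 16 and 17 days ago,
# so that a missed run is retried on the following days.
delete_window() {
  for days in 15 16 17; do
    curl -s -XDELETE -u "$ES_AUTH" "$ES_URL/_snapshot/$REPO/$(snapshot_name "$days")"
  done
}

# Nothing touches the cluster unless the script is called with "run".
if [ "${1:-}" = "run" ]; then
  delete_window
fi
```

Asking to delete a snapshot that is already gone should just return an error from the API, so the extra calls are expected to be harmless.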
