but it is very slow. What is good practice for managing snapshots? I just need a set of snapshots in case of a crash or admin error, which means about 2 weeks of retention.
@dao, you can use S3 lifecycle management. With an S3 lifecycle rule you can set how long your data is kept before it is deleted from the bucket.
Correct, but how do I know which files in the bucket I can delete? I imagine that some of the files created by the snapshots are linked in some way (I understand that snapshots are incremental). If I delete a file, I may corrupt some snapshots, right?
@Dao, as per my understanding, indices are created with a timestamp, meaning one index per day for a single index pattern. So all the indices are independent of each other. You can delete any index or snapshot and it will not have any impact on the other indices or snapshots.
Do not ever delete files directly from a snapshot repository unless you are deleting the complete repository. Instead use the APIs to delete snapshots. This blog post is old, but still describes what goes on behind the scenes and how it works quite well.
Yes... it will be quite difficult to identify the files that belong to a specific snapshot.
I have read that post, and I see that snapshots are incremental. So I have one question for you: can we delete any snapshot at any time using the API or Curator?
Can we set a lifecycle rule on our S3 bucket so that it deletes the snapshots from the bucket? As far as I know, that would be the same as deleting files directly from the repository, so we should not use an S3 lifecycle rule to delete snapshots.
@Tek_Chand You can delete any snapshot at any time through the API. The remaining snapshots will make sure they keep access to all the segments required to restore data as of that point in time.
You should NOT delete files directly from a bucket as old segments may still be in use, which would potentially corrupt newer snapshots.
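For the 2 weeks of retention asked about above, here is a minimal sketch of pruning through the snapshot API instead of touching the bucket. It assumes a cluster reachable at `http://localhost:9200` without authentication, a repository named `my_s3_repository` (both hypothetical), and a `GET _snapshot/<repo>/_all` response of the form `{"snapshots": [...]}` in which each entry carries `snapshot` and `start_time_in_millis`; check the response shape on your version before relying on it.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timedelta, timezone

ES_URL = "http://localhost:9200"  # assumption: local cluster, no auth
REPO = "my_s3_repository"         # hypothetical repository name


def expired_snapshots(snapshots, now, retention_days=14):
    """Return the names of snapshots started more than retention_days before now."""
    cutoff_ms = (now - timedelta(days=retention_days)).timestamp() * 1000
    return [s["snapshot"] for s in snapshots
            if s["start_time_in_millis"] < cutoff_ms]


def list_snapshots(es_url=ES_URL, repo=REPO):
    """Fetch snapshot metadata; assumes a {"snapshots": [...]} response body."""
    with urllib.request.urlopen(f"{es_url}/_snapshot/{repo}/_all") as resp:
        return json.load(resp)["snapshots"]


def delete_snapshot(name, es_url=ES_URL, repo=REPO):
    """Delete one snapshot via the API, so shared segments stay intact."""
    url = f"{es_url}/_snapshot/{repo}/{urllib.parse.quote(name, safe='')}"
    req = urllib.request.Request(url, method="DELETE")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def prune_old_snapshots(retention_days=14):
    """Delete every snapshot older than the retention window, one at a time."""
    now = datetime.now(timezone.utc)
    for name in expired_snapshots(list_snapshots(), now, retention_days):
        delete_snapshot(name)
```

Calling `prune_old_snapshots()` from a daily cron job would give roughly the retention described above. Curator can do the same job if you prefer not to script it yourself.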
@dao If you have a lot of snapshots, there can be a lot of processing to determine exactly which segments need to be kept. You could perhaps set up a new repository and switch to it. Be sure to trim old snapshots regularly to keep the size down. This will also make the repository faster to process. Once you no longer need the old repository, you can delete it completely.
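As a rough illustration of setting up a new repository, the sketch below builds the `PUT _snapshot/<name>` request that registers an S3 repository. The cluster URL, repository name, bucket and base path are all hypothetical placeholders, and it assumes S3 repository support is available on the cluster (older versions need the `repository-s3` plugin installed).

```python
import json
import urllib.request


def build_register_repo_request(es_url, repo_name, bucket, base_path):
    """Build (but do not send) the request that registers an S3 snapshot repository."""
    body = {
        "type": "s3",
        "settings": {
            "bucket": bucket,        # hypothetical bucket name
            "base_path": base_path,  # use a fresh prefix, separate from the old repository
        },
    }
    return urllib.request.Request(
        f"{es_url}/_snapshot/{repo_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Placeholder values for illustration only.
request = build_register_repo_request(
    "http://localhost:9200", "snapshots_v2", "my-bucket", "snapshots/v2"
)
```

Passing `request` to `urllib.request.urlopen` would send it. Pointing the new repository at a different `base_path` (or a different bucket) keeps its files apart from the old repository, so the old one can later be removed as a whole.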