If I create a snapshot daily, there will be 365 snapshots a year later. Does that mean only the first one is a complete snapshot, and the other 364 are incremental snapshots, each based on the previous one? In that case, we can't purge any old snapshots.
Is it possible to force the creation of a snapshot with a complete copy of the data?
I used /_snapshot/my_backup/_all to get all snapshots, but I couldn't tell from the response which one holds a complete copy of the data. How can I tell which one is complete? Thanks.
@TimV Thanks a lot for the quick response, which makes sense to me.
But I am still interested in how the snapshot deletion API works, because I need to work out a plan to delete the old data. Can you please provide more detailed info (workflow or algorithm?) on the deletion API?
Another thing is that I have to use the storage service provided by our own cloud infrastructure, so I have to implement a new Elasticsearch plugin to support it. Are there any online documents or guides?
I don't understand. The delete API will delete the old data for you - you just need to decide when you aren't interested in keeping that snapshot any longer.
If you delete a snapshot that is still sharing data with another (presumably newer) snapshot, then the shared data will not be deleted.
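To make the point about shared data concrete, here is a toy model in Python. It is only a sketch of the idea, not Elasticsearch's actual implementation: the `ToyRepository` class and the file names are made up for illustration. Each snapshot records the set of files it references, and deleting a snapshot physically removes only the files that no remaining snapshot still references.

```python
class ToyRepository:
    """Hypothetical toy model: snapshots reference files, and files can be shared."""

    def __init__(self):
        self.snapshots = {}  # snapshot name -> set of file names it references
        self.files = set()   # files physically stored in the repository

    def create(self, name, file_names):
        # Record the snapshot and store any files that are not stored yet.
        self.snapshots[name] = set(file_names)
        self.files |= set(file_names)

    def delete(self, name):
        # Remove the snapshot entry, then remove only the files that no
        # remaining snapshot references. Returns the files actually removed.
        removed = self.snapshots.pop(name)
        still_needed = set().union(*self.snapshots.values())
        deletable = removed - still_needed
        self.files -= deletable
        return deletable


repo = ToyRepository()
repo.create("snapshot_1", {"seg_a", "seg_b"})
repo.create("snapshot_2", {"seg_a", "seg_b", "seg_c"})

# Deleting snapshot_1 succeeds, but its files are still used by snapshot_2,
# so nothing is physically removed and snapshot_2 stays fully restorable.
freed = repo.delete("snapshot_1")
print(sorted(freed))
print(sorted(repo.files))
```

In this model the order of deletion does not matter: a snapshot can always be deleted, and a file disappears only once the last snapshot referencing it is gone.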
I don't believe so. Your best path is to review and learn from the official repository plugins, and then ask questions if you need more info.
Let me ask this question another way. I created a snapshot named "snapshot_1", then fed more data into the Elasticsearch cluster, and then created another snapshot named "snapshot_2". The first snapshot, "snapshot_1", certainly had a complete copy of the data because it was the very first one. "snapshot_2" is most likely an incremental one based on snapshot_1. Finally, I deleted "snapshot_1", and the deletion succeeded. So I am a little confused: why could "snapshot_1" be deleted successfully when snapshot_2 (most likely) depends on it? Does it mean that snapshot_1 was not actually deleted, even though I got a successful response to the deletion?
It seems that Elasticsearch only supports the following four repository types, but I need to use our own cloud storage service. That is why I asked for guides on how to implement a new Elasticsearch plugin. Did I miss anything? Thanks!
Shared filesystem, such as a NAS
Amazon S3
HDFS (Hadoop Distributed File System)
Azure Cloud
The snapshot was deleted. You cannot restore from that snapshot anymore. It may not have deleted all the files in use by that snapshot, but it doesn't claim to do so.
Note that snapshots are not incremental in the simplest sense of just writing deltas from the previous backup. They follow the underlying Lucene segments, which means merge events on the underlying indices will be reflected in the snapshots.
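As a rough illustration of what "following the segments" means, here is a small Python sketch. It is a simplified model, not the real snapshot code: the `files_to_upload` helper and the segment names are hypothetical. Each snapshot refers to every segment file the index has at that moment, but only the files that are not already in the repository get copied.

```python
def files_to_upload(index_segments, repository_files):
    """Toy model: segment files a new snapshot would still need to copy."""
    return set(index_segments) - set(repository_files)


repository_files = set()

# Day 1: the index has two segments, so the first snapshot copies both.
day1 = {"seg_1", "seg_2"}
upload1 = files_to_upload(day1, repository_files)
repository_files |= upload1

# Day 2: new documents produced one more segment; only that one is copied.
day2 = {"seg_1", "seg_2", "seg_3"}
upload2 = files_to_upload(day2, repository_files)
repository_files |= upload2

# Day 3: a merge rewrote seg_1..seg_3 into seg_4. No documents were added,
# but the merged segment is a new file, so it is copied in full.
day3 = {"seg_4"}
upload3 = files_to_upload(day3, repository_files)
repository_files |= upload3

print(sorted(upload1), sorted(upload2), sorted(upload3))
```

In this model every snapshot lists all of the segment files it needs, so none of them depends on an earlier snapshot still existing. The amount copied on a given day depends on which segment files are new, which is why a merge can make a snapshot larger than the amount of newly indexed data would suggest.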
And my answer was that we don't have docs for that, and your best option is to look at the code for the existing plugins, and use that to guide you.