I have a theoretical question about how Elasticsearch snapshots work. I know snapshots are incremental.
We have one cluster in which we keep the data forever. We take a snapshot daily and want to keep 30 days of snapshots. After 30 days, when the snapshots rotate, is the data from the expiring snapshot merged into the existing snapshots so that it does not have to be snapshotted again? Or is the snapshot simply deleted once it passes retention, so that a new snapshot has to back up the same data again?
Each snapshot is a full snapshot of all the data in the cluster. The incremental aspect you are referring to is that any segments within a shard that have already been snapshotted will not be copied again, which means only new data is copied over when the snapshot is taken. Each snapshot links to and reuses the segments already stored in the repository, so those segments are not deleted when the snapshot that originally copied them is removed, as long as another snapshot still references them.
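To make the reuse behaviour concrete, here is a minimal toy model in Python. It is not Elasticsearch code, and the class name `ToyRepository`, its methods, and the segment names are all made up for illustration. It only assumes what is described above: a snapshot copies segments the repository does not have yet, and deleting a snapshot removes only segments that no remaining snapshot references.

```python
class ToyRepository:
    """Hypothetical model of a snapshot repository (illustration only)."""

    def __init__(self):
        self.blobs = set()      # segment files physically stored in the repository
        self.snapshots = {}     # snapshot name -> segments that snapshot references

    def take_snapshot(self, name, live_segments):
        """Record a full view of the cluster, copying only segments not yet stored."""
        live = frozenset(live_segments)
        copied = live - self.blobs
        self.blobs |= copied
        self.snapshots[name] = live
        return copied

    def delete_snapshot(self, name):
        """Drop a snapshot and remove only segments nothing else references."""
        del self.snapshots[name]
        still_referenced = set().union(*self.snapshots.values())
        removed = self.blobs - still_referenced
        self.blobs -= removed
        return removed


repo = ToyRepository()
copied_day1 = repo.take_snapshot("day1", {"seg_a", "seg_b"})
copied_day2 = repo.take_snapshot("day2", {"seg_a", "seg_b", "seg_c"})
removed = repo.delete_snapshot("day1")

print(sorted(copied_day1))  # segments copied by the first snapshot
print(sorted(copied_day2))  # only the new segment is copied
print(sorted(removed))      # nothing is removed, day2 still needs seg_a and seg_b
```

In this model, rotating out the oldest daily snapshot does not force the next snapshot to copy the old data again, because the shared segments stay in the repository.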
Let's say the index indices-20230201 is deleted on 8 Feb (weekly policy). Snapshots taken after 8 Feb will not reference this index, so if we delete the snapshots older than 8 Feb, the index will be gone from all snapshots, right?
If indices-20230201 is retained forever, then even if we delete all snapshots except the latest one, we can still restore the index. Am I correct in my understanding?
Each snapshot contains the data that is available in the cluster when it was taken. If an index is deleted, snapshots generated after that time will no longer contain that index. If the oldest snapshot you have was taken after the index was deleted, you no longer have access to that index.
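The same point as a small Python sketch. This is only a model of the rule described above, not a call to any Elasticsearch API; the function `restorable_indices`, the snapshot names, and the second index `indices-20230206` are hypothetical.

```python
def restorable_indices(snapshots):
    """Return every index that appears in at least one remaining snapshot."""
    result = set()
    for indices in snapshots.values():
        result |= indices
    return result


# Assumed scenario: indices-20230201 is deleted from the cluster on 8 Feb,
# so the snapshot from 7 Feb contains it and the one from 9 Feb does not.
snapshots = {
    "snap-2023-02-07": {"indices-20230201", "indices-20230206"},
    "snap-2023-02-09": {"indices-20230206"},
}

before = restorable_indices(snapshots)
del snapshots["snap-2023-02-07"]   # retention removes the older snapshot
after = restorable_indices(snapshots)

print("indices-20230201" in before)
print("indices-20230201" in after)
```

If the index had never been deleted from the cluster, it would appear in every snapshot, including the latest one, so it would stay restorable no matter how many older snapshots were removed.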