I am trying out the snapshot/restore feature in ES. I did the following:
(1) Registered one of my local directories as a repository
(2) Took a snapshot of the indexed data using the command PUT /_snapshot/repo1/snapshot1
(3) Made modifications and re-indexed my original data
(4) Took another snapshot: PUT /_snapshot/repo1/snapshot2
(5) (a) Deleted my index that had all the data, then restored snapshot2
(b) Deleted my index that had all the data, deleted snapshot1, then restored snapshot2
If I understood it right, snapshot2 is by default an incremental snapshot. In both 5 (a) and 5 (b) I see all my data, as expected. My question is:
Shouldn't restoring snapshot2 restore only the delta it is aware of? How is it able to restore all the data? Even after I had deleted snapshot1, how was it able to recover all the files? Could someone explain the incremental concept better?
Basically, I was trying out this scenario to figure out whether I can implement a cleanup policy that removes snapshots created before a particular date.
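To make the cleanup idea concrete, here is a minimal sketch of the selection step only. It assumes the snapshot listing looks like the "snapshots" array returned by GET /_snapshot/repo1/_all, with a "snapshot" name and a "start_time_in_millis" field on each entry; please check that against your ES version. The helper name and the sample data are hypothetical, and no request is actually sent.

```python
from datetime import datetime, timezone


def snapshots_older_than(snapshots, cutoff):
    """Return the names of snapshots that started before `cutoff`."""
    cutoff_ms = int(cutoff.timestamp() * 1000)
    return [s["snapshot"] for s in snapshots if s["start_time_in_millis"] < cutoff_ms]


# Hypothetical listing, shaped like the assumed "snapshots" array.
listing = [
    {"snapshot": "snapshot1", "start_time_in_millis": 1420070400000},  # 2015-01-01 UTC
    {"snapshot": "snapshot2", "start_time_in_millis": 1422748800000},  # 2015-02-01 UTC
]

cutoff = datetime(2015, 1, 15, tzinfo=timezone.utc)
to_delete = snapshots_older_than(listing, cutoff)

# Paths that a cleanup job would issue DELETE requests against.
delete_paths = [f"/_snapshot/repo1/{name}" for name in to_delete]
print(delete_paths)
```

Each selected name would then be removed with DELETE /_snapshot/repo1/<name>, one snapshot at a time.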
As per your clarification, if I try to snapshot a second time into "snapshot1", I see this exception:
{
"error": "InvalidSnapshotNameException[[repo1:snapshot1] Invalid snapshot name [snapshot1], snapshot with such name already exists]",
"status": 400
}
How do I specify that this is incremental? Am I missing something?
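For what it's worth, the exception only says that a snapshot name must be unique within the repository. A minimal sketch of building a fresh name for every run follows; the timestamped naming scheme and the helper name are hypothetical, just one way to avoid the collision.

```python
from datetime import datetime, timezone


def snapshot_path(repo, prefix, when):
    """Build a request path with a unique, timestamped snapshot name."""
    name = f"{prefix}-{when.strftime('%Y.%m.%d-%H.%M.%S')}"
    return f"/_snapshot/{repo}/{name}"


# Fixed timestamp so the example is reproducible; a real job would use
# datetime.now(timezone.utc) instead.
path = snapshot_path("repo1", "snapshot", datetime(2015, 1, 30, 12, 0, 0, tzinfo=timezone.utc))
print(path)
```

A PUT against the resulting path would then create a new snapshot instead of hitting InvalidSnapshotNameException.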
So if I understood it right, the expectation is:
-> if I restore snapshot2, I should see only the new segments? (My observation was different, though: I could see both old and new segments.)
In my initial post I had mentioned the steps I followed, and this was my question:
Shouldn't restoring snapshot2 restore only the delta it is aware of? How is it able to restore all the data? Even after I had deleted snapshot1, how was it able to recover all the files?
Snapshot1 pushed segments 1, 2, 3 to repository X.
Snapshot2 needs to back up segments 1, 2, 3, 4 (4 contains the new data). But repository X already has segments 1, 2, 3, so for snapshot2 the _snapshot action will only copy segment 4.
When you delete snapshot1, those files are not removed, because snapshot2 also has links to segments 1, 2, 3.
When you restore snapshot2, you restore all the linked files: segments 1, 2, 3 and 4.
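The bookkeeping described above can be sketched as a toy model. This is not how Elasticsearch implements it internally; the class and method names are hypothetical, and segments are plain integers. It only illustrates that each file is stored once and kept for as long as any snapshot still links to it.

```python
class Repository:
    """Toy model of a snapshot repository that stores each segment file once."""

    def __init__(self):
        self.files = set()      # segment files physically present in the repository
        self.snapshots = {}     # snapshot name -> set of segment files it links to

    def snapshot(self, name, index_segments):
        """Record a snapshot and return the segments that had to be copied."""
        if name in self.snapshots:
            raise ValueError(f"snapshot with name [{name}] already exists")
        wanted = set(index_segments)
        copied = wanted - self.files    # only files the repository lacks
        self.files |= copied
        self.snapshots[name] = wanted
        return copied

    def delete(self, name):
        """Delete a snapshot and return the files no other snapshot links to."""
        del self.snapshots[name]
        still_used = set().union(*self.snapshots.values())
        removed = self.files - still_used
        self.files -= removed
        return removed

    def restore(self, name):
        """Return every segment linked by the snapshot, old and new alike."""
        return set(self.snapshots[name])


repo = Repository()
copied1 = repo.snapshot("snapshot1", {1, 2, 3})      # copies 1, 2, 3
copied2 = repo.snapshot("snapshot2", {1, 2, 3, 4})   # copies only 4
removed = repo.delete("snapshot1")                   # nothing removed: snapshot2 links 1, 2, 3
restored = repo.restore("snapshot2")                 # all of 1, 2, 3, 4
print(copied1, copied2, removed, restored)
```

In this model "incremental" describes only what gets copied at snapshot time; every snapshot still references the complete set of files it needs, which is why restoring snapshot2 brings back everything.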
I don't really know. IIRC, the list of files used by each snapshot is stored in a metadata file.
The question is more: what exactly do you want to know?
You have to understand that a snapshot backs up shards, not individual documents. So the difference you are looking for won't tell you which documents were backed up on the second run.