I was trying to understand the concept of incremental backups. So, here is a scenario:
Create an initial snapshot called snap1 (backed up in First Month)
Create another snapshot, snap2, in the same repo (backed up in Second Month; most of the documents were newly indexed after snap1 was taken)
Create a snap3 (just to be sure)
Now, when I recover data from snap2, will I have documents that were there in snap1 too? Ideally, I don't want the old data being restored again.
Restoring a snapshot should just restore the documents that were visible when you took the snapshot.
Snapshots work by copying all of the files that make up the index into the repository and restoring them when you need them. They save space by not copying files that haven't changed since the last snapshot. They use reference counting to know when it is ok to delete the files. This works out well in an index that you are just adding documents to because most of the index files are write once. If you do lots of deletes or updates then the backups will be able to save less space.
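To make the idea concrete, here is a minimal toy model in Python of the behaviour described above. It is only a sketch, not Elasticsearch's actual implementation: the `ToyRepository` class, its methods, and the segment names (`seg_1`, `seg_2`) are all made up for illustration. It shows that a second snapshot only copies the files the repository does not already hold, and that restoring it still returns every file the index had at that moment.

```python
class ToyRepository:
    """Toy model of a snapshot repository (hypothetical, for illustration only)."""

    def __init__(self):
        self.files = {}      # file name -> content stored in the repository
        self.snapshots = {}  # snapshot name -> set of file names it references

    def snapshot(self, name, index_files):
        """Record a snapshot and return the names of the files actually copied."""
        # Only files the repository does not already hold need to be copied.
        copied = [f for f in index_files if f not in self.files]
        for f in copied:
            self.files[f] = index_files[f]
        # The snapshot still references every file in the index, old or new.
        self.snapshots[name] = set(index_files)
        return sorted(copied)

    def restore(self, name):
        """Return every file the index contained when the snapshot was taken."""
        return {f: self.files[f] for f in self.snapshots[name]}


repo = ToyRepository()

index = {"seg_1": "documents indexed in month 1"}
copied_first = repo.snapshot("snap1", index)

index["seg_2"] = "documents indexed in month 2"
copied_second = repo.snapshot("snap2", index)

restored = repo.restore("snap2")
print(copied_first, copied_second, sorted(restored))
```

In this model snap2 copies only `seg_2`, yet restoring it returns both `seg_1` and `seg_2`, because the month 1 documents were still visible in the index when snap2 was taken.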
So, does this mean deleting an old snapshot will delete all the data backed up in that snapshot?
That doesn't seem to be the case when I run a test case. I still see data from snap1 being restored even after I have deleted snap1.
I am trying to restore a newer snapshot, but the old documents are coming in too.
When you delete a snapshot, only the files (segments, basically) that are no longer used by any other snapshot are removed.
If another snapshot still points to a segment, then its data is not removed.
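As a rough sketch of that rule in Python (a toy model, not Elasticsearch's real code; `delete_snapshot` and the segment names are hypothetical), deleting a snapshot only removes the files that no remaining snapshot references:

```python
def delete_snapshot(snapshots, stored_files, name):
    """Delete snapshot `name` and return the set of files actually removed.

    snapshots:    dict mapping snapshot name -> set of file names it references
    stored_files: set of file names currently held in the repository
    """
    removed_refs = snapshots.pop(name)
    # Files referenced by any remaining snapshot must be kept.
    still_referenced = set().union(*snapshots.values())
    deleted = removed_refs - still_referenced
    stored_files -= deleted  # mutates the caller's set in place
    return deleted


snapshots = {
    "snap1": {"seg_1"},
    "snap2": {"seg_1", "seg_2"},
    "snap3": {"seg_1", "seg_2", "seg_3"},
}
stored_files = {"seg_1", "seg_2", "seg_3"}

# snap2 and snap3 still reference seg_1, so nothing is removed.
deleted_first = delete_snapshot(snapshots, stored_files, "snap1")

# seg_3 is referenced only by snap3, so it is removed along with snap3.
deleted_second = delete_snapshot(snapshots, stored_files, "snap3")

print(deleted_first, deleted_second, sorted(stored_files))
```

So in the scenario from the question, deleting snap1 does not remove the month 1 data from the repository as long as snap2 or snap3 still references those segments, and restoring snap2 still brings those documents back.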