We use the Elasticsearch snapshot API to back up our ES data daily, and we also regularly copy the entire contents of the snapshot repository to a remote system.
We now have a large amount of data in ES (several TB), while the daily increase in actual data is small (on the order of a few GB), so we do not think it is feasible to transfer TBs of backed-up data to the remote system every day.
ES snapshots are incremental: each snapshot of an index only stores data that is not part of an earlier snapshot. Given that:
1. Is it possible to clearly identify only the incremental changes made to the snapshot repository when a snapshot is created? This would let us transfer only the incremental snapshot content to the remote system rather than the entire repository. If yes, could you please help us understand how that can be achieved?
2. If point 1 can be achieved, how would restore work? Can the incremental snapshots be combined and placed into the snapshot repository for a restore to ES?
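
To make the first question concrete, here is a minimal sketch of what we mean by "identifying the incremental changes". It assumes the repository is a shared-filesystem (fs) repository that can be walked like an ordinary directory. The function names build_manifest and diff_manifests are our own illustration and are not part of any Elasticsearch API. The idea is to record a file listing before and after a snapshot and compare the two. Whether copying only the resulting files keeps the remote copy restorable is exactly what we are unsure about.

```python
import os


def build_manifest(repo_root):
    """Map each file path (relative to repo_root) to its (size, mtime_ns)."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(repo_root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            stat = os.stat(full_path)
            rel_path = os.path.relpath(full_path, repo_root)
            manifest[rel_path] = (stat.st_size, stat.st_mtime_ns)
    return manifest


def diff_manifests(before, after):
    """Return (added, changed, removed) relative paths between two manifests."""
    added = sorted(p for p in after if p not in before)
    changed = sorted(p for p in after if p in before and after[p] != before[p])
    removed = sorted(p for p in before if p not in after)
    return added, changed, removed
```

In use, we would call build_manifest on the repository location (for example a hypothetical mount such as /mnt/es_backups) before and after the daily snapshot, copy the added and changed files to the remote system, and delete the removed ones there so that both copies stay identical.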
Note: Version used - elasticsearch-oss:7.0.1
Any pointers/suggestions would be appreciated.