This cron job runs daily and backs up my index to AWS S3; each day the
snapshot has a different name.
I want to make sure that I am not duplicating, for example, a 10GB index in
S3 every day. Does it look at my index from yesterday and only back up the
changes? What if there were no changes? What does that mean for today's
snapshot vs. yesterday's snapshot (is there a duplicate)?
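For reference, a daily job like the one described could be sketched as below. This is only an illustration, not the poster's actual script: the repository name `my_s3_repository`, the `snapshot-YYYY.MM.DD` naming scheme, and the `localhost:9200` address are all assumptions. It builds the `PUT /_snapshot/<repository>/<snapshot>` request used by the Elasticsearch snapshot API without sending it.

```python
import datetime
import urllib.request


def build_snapshot_request(base_url, repository, day):
    """Build (but do not send) the request that creates a date-named snapshot.

    base_url and repository are hypothetical values for illustration.
    """
    # Snapshot names must be lowercase, so a dotted date suffix is safe.
    snapshot_name = "snapshot-" + day.strftime("%Y.%m.%d")
    url = (
        f"{base_url}/_snapshot/{repository}/{snapshot_name}"
        "?wait_for_completion=true"
    )
    return urllib.request.Request(url, method="PUT")


if __name__ == "__main__":
    req = build_snapshot_request(
        "http://localhost:9200", "my_s3_repository", datetime.date.today()
    )
    print(req.get_method(), req.full_url)
    # To actually trigger the snapshot: urllib.request.urlopen(req)
```

Because every snapshot in the same repository shares the same underlying files, giving each day a new name does not by itself force a full copy.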
Well. It is incremental.
But let's say you have saved old Lucene segments and those old segments have since been merged into a new, bigger one. The next snapshot will copy the new BIG segment and remove the old ones.
It means that old data will be copied twice in this scenario.
Makes sense?
--
David
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
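The behaviour described above can be illustrated with a toy model. This is a simplified sketch, not Elasticsearch's implementation: it assumes segment files are identified by name only, uses made-up segment names and sizes, and ignores the cleanup of old files. Each snapshot copies only the segment files the repository does not already hold.

```python
def take_snapshot(repository, snapshots, name, live_segments):
    """Copy only the segment files missing from the repository.

    repository:    dict of segment name -> size in GB already stored
    snapshots:     dict of snapshot name -> set of segment names it references
    live_segments: dict of segment name -> size in GB currently in the index
    Returns the number of GB copied by this snapshot.
    """
    copied = 0
    for segment, size in live_segments.items():
        if segment not in repository:
            repository[segment] = size
            copied += size
    snapshots[name] = set(live_segments)
    return copied


def simulate():
    """Three daily snapshots: initial, unchanged, and after a segment merge."""
    repository, snapshots = {}, {}
    day1 = {"seg_a": 4, "seg_b": 6}   # hypothetical 10GB index in two segments
    day2 = dict(day1)                 # no changes since yesterday
    day3 = {"seg_c": 10}              # seg_a and seg_b merged into one big segment
    return [
        take_snapshot(repository, snapshots, "day1", day1),
        take_snapshot(repository, snapshots, "day2", day2),
        take_snapshot(repository, snapshots, "day3", day3),
    ]


if __name__ == "__main__":
    print(simulate())  # [10, 0, 10]
```

In this model the unchanged day copies nothing, while the day after the merge copies the full 10GB again even though no documents changed, which is the "copied twice" case.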
Thanks, that makes sense in this case. I assume there is no way to prevent
something like that from happening?
On Wednesday, August 6, 2014 1:29:40 PM UTC-4, David Pilato wrote:
Well. It is incremental.
But let's say you have saved old Lucene segments and those old segments have
since been merged into a new, bigger one. The next snapshot will
copy the new BIG segment and remove the old ones.
It means that old data will be copied twice in this scenario.
Makes sense?
--
David
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
On August 6, 2014, at 18:36, IronMike <sabda...@gmail.com> wrote:
This cron job runs daily and backs up my index to AWS S3; each day the
snapshot has a different name.
I want to make sure that I am not duplicating, for example, a 10GB index in
S3 every day. Does it look at my index from yesterday and only back up the
changes? What if there were no changes? What does that mean for today's
snapshot vs. yesterday's snapshot (is there a duplicate)?