We are running Elasticsearch 1.7.2 in production with 3 master nodes, 3 data nodes and 1 client node. The cluster has 3 indices:
Index name   Primary size   Shards
es_chr       2 GB           1
es_cval      173 GB         6
es_item      850 GB         18
Yesterday we took a snapshot on the production cluster. In the repository I can see that snapshot files and folders were generated according to the number of shards allotted. We then restored the indices on the DR (Disaster Recovery) cluster for backup purposes. To restore on the DR cluster, I manually copied the snapshot files from the PROD server to the DR server (through SCP) and started the restore, which completed successfully. Suppose I take another snapshot of the index es_item today; by then data may have changed or new data may have been added to the index.
Do I need to manually copy the snapshot files from the PROD server to the DR server again for the backup?
(or)
Is there any other workaround to move only the incremental data?
How can I find the difference between the initial and the incremental snapshot?
We don't have a common repository for the PROD and DR servers. For the PROD server we created a NAS mount as the repository for backups, but the DR server is in a different location, so we can't rsync the repository to the DR cluster. Instead we created a local repository on the DR server and manually move the snapshot files from the PROD repository to the DR local repository.
Please give your feedback to overcome this problem.
But you need to add all files available in the PROD repository to the DR repository.
I don't know how you can "diff" the two directories... Maybe sort by date and copy only the "new" files?
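As a rough illustration of that idea, here is a minimal Python sketch that walks a copy of the PROD repository and mirrors only new or changed files into the DR repository directory. It is not an Elasticsearch feature: the function name `sync_repo`, the `delete_extra` flag and the "same size and not older" comparison are all assumptions made for this example, and it assumes both directories are reachable from one machine (for instance via a mount or a staging directory). It is probably safest to run something like this only when no snapshot is in progress.

```python
import os
import shutil


def sync_repo(src_root, dst_root, delete_extra=False):
    """Mirror new or changed files from src_root into dst_root.

    Hypothetical helper for illustration only.
    Returns (copied, removed) as sorted lists of relative paths.
    """
    copied, removed = [], []
    src_files = set()

    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        for name in filenames:
            rel = os.path.normpath(os.path.join(rel_dir, name))
            src_files.add(rel)
            src = os.path.join(src_root, rel)
            dst = os.path.join(dst_root, rel)
            s = os.stat(src)
            if os.path.exists(dst):
                d = os.stat(dst)
                # Assumption: same size and not older than the source
                # means the file is unchanged and can be skipped.
                if d.st_size == s.st_size and int(d.st_mtime) >= int(s.st_mtime):
                    continue
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also preserves the modification time
            copied.append(rel)

    if delete_extra:
        # Optionally drop files that no longer exist in the source tree.
        for dirpath, _dirnames, filenames in os.walk(dst_root):
            rel_dir = os.path.relpath(dirpath, dst_root)
            for name in filenames:
                rel = os.path.normpath(os.path.join(rel_dir, name))
                if rel not in src_files:
                    os.remove(os.path.join(dst_root, rel))
                    removed.append(rel)

    return sorted(copied), sorted(removed)
```

Calling `sync_repo("/path/to/prod_repo_copy", "/path/to/dr_repo")` (both paths are placeholders) returns the lists of copied and removed files, which also gives a crude view of what changed between two snapshots. A size-and-time check can miss a file rewritten with the same size within the same second; comparing checksums would be stricter but slower on an 850 GB index.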
I took a second snapshot after adding some documents and tried to find the difference. I found that some files had also been removed and some had been modified. But if I am only adding documents, why do some files get removed from the backup?