I am trying to use the Snapshot and Restore feature to push index data from one running/live cluster to another live cluster without any downtime in the destination cluster. I have spun up two multi-node clusters on my local machine, started them, and registered a common repository with both. I then took a snapshot from the source cluster and am now trying to restore it into the destination cluster. However, the restore fails with an error because an open index with same name already exists in the cluster. The error message goes on: "Either close or delete the existing index or restore the index under a different name by providing a rename pattern and replacement name."
Note that the destination cluster I am restoring to is also a live cluster serving user query traffic, so I cannot close the index there before restoring it.
The index I am trying to restore has a timestamp in its name, and an alias points to the index. I am not sure whether the index names, including the timestamp, will be exactly the same in production systems. Even if they turn out to differ, I do not want the restore to succeed only because the names happen not to match.
Hence the question: is there any way to work around the problem that an index with the same name as the one being restored from the source cluster already exists in the destination cluster? Perhaps the index could be renamed while taking the snapshot on the source, or before restoring it on the destination. I read that an index can be renamed during restore, but my understanding is that the rename happens only after the index has been restored. I have tried it, and it still does not work.
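For reference, here is a minimal sketch of what a restore with a rename pattern, followed by an alias switch, might look like. It only builds the request bodies for the restore endpoint (`POST /_snapshot/<repo>/<snapshot>/_restore`) and the aliases endpoint (`POST /_aliases`) and does not send anything. The repository, snapshot, index, and alias names are hypothetical placeholders, and switching the alias afterwards is an assumption about how the renamed index would be put into service, not something I have confirmed works in my setup.

```python
import json


def restore_with_rename(repo, snapshot, index, prefix):
    """Build a restore request that restores `index` under a new name.

    rename_pattern is a regular expression matched against the index name
    stored in the snapshot; rename_replacement may reference its capture
    groups ($1), so the restored index here is named prefix + original name.
    """
    path = f"/_snapshot/{repo}/{snapshot}/_restore"
    body = {
        "indices": index,
        "include_global_state": False,
        # Skip aliases from the snapshot so the live alias is not touched.
        "include_aliases": False,
        "rename_pattern": "(.+)",
        "rename_replacement": f"{prefix}$1",
    }
    return "POST", path, body


def swap_alias(alias, old_index, new_index):
    """Build one _aliases request that moves `alias` to the new index.

    Both actions travel in a single request, so the alias is removed from
    the old index and added to the new one in the same update.
    """
    body = {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }
    return "POST", "/_aliases", body


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    requests_to_send = [
        restore_with_rename(
            "shared_repo", "nightly_snapshot", "products-2020.01.01", "restored-"
        ),
        swap_alias(
            "products", "products-2020.01.01", "restored-products-2020.01.01"
        ),
    ]
    for method, path, body in requests_to_send:
        print(method, path)
        print(json.dumps(body, indent=2))
```

With these bodies the restored copy would be created as `restored-products-2020.01.01` next to the open index rather than on top of it, and queries would keep going through the `products` alias.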
We have also explored the clone API of Elasticsearch as a way to transfer index data from one cluster to another. However, it also requires that the destination index not exist, so it would have to be deleted before cloning.
More background: we are planning to design a process that transfers index data from a source cluster to a geographically separate destination cluster on a daily basis (say, every night). We are aware that Elasticsearch has a premium feature called cross-cluster replication, which replicates data from one cluster to another using the leader and follower index concept. However, we would like to try out all other options before making any decision.
There may be questions about the approach of updating a live index through restore while another process is also updating it. That is a valid concern and one that needs to be thought through. However, updates outside of restore are controlled by us, and we can choose to restore only those indices that cannot be updated in the live cluster outside of the restore process.