This is the old datacenter awareness issue again. I've done a thorough search and haven't found any good solution for moving an ES cluster to a new DC, so I'm hoping someone here has a viable method!
So we're moving datacenters and I've got the new ES cluster up at the new DC (let's call it DC2). Both the old (DC1) and new (DC2) clusters are running the same ES version (0.90.13), but they have a different number of datanodes, and the DC1 datanodes run CentOS while the DC2 datanodes run Windows 2012 R2. Yes, I know, none of this is ideal, but it's what I've been given, so...
I've been testing tools like ElasticDump and Knapsack, and they're fine for our smaller indices, but the one critical index is 100G, and neither tool would be able to move that data within the outage window, not to mention we don't even have the bandwidth between the DCs to make that work.
So I've a couple of options I'm working on:
Use cluster awareness
I've been playing with the cluster awareness settings, but I'm not sure they're going to get me what I want. Ideally, I'd add the new DC2 nodes to the DC1 cluster and configure it so that the DC2 nodes only ever get replicas and never hold primary shards. At the same time, I can't drop below a replica factor of 1 on the DC1 side, since it's still our production system and I cannot reduce its redundancy.
I haven't been able to come up with a set of awareness settings that would result in this scenario, or perhaps the correct sequence to do so. Is this even possible to do via the 0.90.13 settings?
(And someone really needs to rewrite the section of the manual on forced awareness; I've read it a dozen times and it still doesn't make sense to me.)
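For reference, this is roughly the kind of configuration I've been experimenting with. The attribute name `dc` and the values `dc1`/`dc2` are just names I picked; as far as I can tell, awareness only tries to spread a shard's copies across the attribute values, it doesn't let you say "primaries stay in dc1":

```
# elasticsearch.yml on each DC1 datanode (arbitrary node attribute I made up)
node.dc: dc1

# elasticsearch.yml on each DC2 datanode
node.dc: dc2

# elasticsearch.yml on all nodes: balance shard copies across the "dc" attribute
cluster.routing.allocation.awareness.attributes: dc

# "forced" awareness: don't pile all copies into one dc value just because
# the other dc has no nodes yet; replicas stay unassigned instead
cluster.routing.allocation.awareness.force.dc.values: dc1,dc2
```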
rsync shards to new DC
This is complicated by the DC2 datanodes being W2012R2, but let's say I can get my hands on a CentOS node in DC2. If I rsync the shards continuously over the coming weeks, my delta during the outage window should be small enough to move quickly. Can I simply start that node up cold and have it recognize the rsync'ed shards, or would I have to somehow notify it that it has shards locally?
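For what it's worth, here's the kind of continuous copy I had in mind. The paths are guesses based on the default 0.90 RPM layout (path.data/<cluster_name>/nodes/<ordinal>/indices/), and the cluster name and destination host are placeholders:

```
# bulk copy now, then re-run periodically so each pass only moves the delta
rsync -av --delete \
  /var/lib/elasticsearch/MYCLUSTER/nodes/0/indices/ \
  dc2-host:/var/lib/elasticsearch/MYCLUSTER/nodes/0/indices/

# final pass during the outage window, with ES stopped on the source node,
# so the files on disk are consistent before starting the DC2 node
```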
Why are you on such an old version?! I'd take the chance to upgrade if nothing else.
You are in a bind here because ES is not built for cross-DC deployments, and you cannot force one side to have primaries and the other side replicas without manually assigning things, which is a hassle and will change as ES reallocates things.
If you were to upgrade to a more recent version of Elasticsearch before the move, you may be able to use the snapshot and restore feature to perform the migration to the new cluster. Depending on how heavily your data set is updated, it might be possible to move the bulk of your data over in an initial snapshot and then periodically create and migrate additional snapshots that account for the delta since the last one was created. This may allow you to switch over with relatively little downtime without having to directly connect the two clusters and would also allow you to test the new cluster and ensure the success of the data migration before the switch.
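To give a rough idea, here is a minimal sketch of what that looks like with the 1.x snapshot/restore API. The repository name, snapshot name, and location are just examples, and the location has to be a filesystem path (NFS or similar) visible to every data and master node:

```
# register a shared-filesystem snapshot repository
curl -XPUT 'http://localhost:9200/_snapshot/dc_move' -d '{
  "type": "fs",
  "settings": { "location": "/mnt/es_backups/dc_move" }
}'

# the first snapshot carries the bulk of the data; later snapshots into the
# same repository are incremental, so they only add what changed since the last one
curl -XPUT 'http://localhost:9200/_snapshot/dc_move/snap_1?wait_for_completion=true'

# on the DC2 cluster, register a repository pointing at a copy of the same
# files, then restore from the latest snapshot
curl -XPOST 'http://localhost:9200/_snapshot/dc_move/snap_1/_restore'
```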
So the ES upgrade was planned for this migration, but then the monkey wrench of all the nodes being Windows was introduced, and I don't believe that changing that many variables at once is a good idea. I've already got a lot of changes to contend with: JDK version (7 -> 8), datanode OS (CentOS -> 2012 R2), dedicated physical nodes -> VMs, dedicated disks -> SAN, seven nodes -> sixteen, 24GB RAM -> 16GB per node... Adding an irreversible ES upgrade to that mix is not my idea of a good time. I'd have preferred to do this differently, but that's not in the cards at the moment.
And thank you for confirming what I suspected about the cluster awareness settings. I managed to get in a couple of test runs yesterday to see how a simulated cluster behaves with different cluster awareness settings, and it was not reassuring. Maybe it's my setup, or the 0.90.13 version, but I had many tests fail when shards would simply never reallocate. The only solution I found was to set replicas to zero, and let the cluster clean up the orphaned shards.
I looked at that functionality, but without upgrading the existing cluster I've no easy way to test it, so I'm still in the same boat, where I have to move the data to the new cluster, and that new cluster happens to be in another datacenter.
I've seen posts from people with mixed-version clusters. Is that a possible upgrade path? I do need to figure out how to upgrade ES, but the DC move will happen before I can do that, and they've already started pulling equipment from the old DC, so even if I wanted to upgrade the existing cluster in place, I can't.
If I get the new cluster up on 0.90.x and can get an equivalent number of new datanodes running 1.7, can I simply join them to the 0.90 cluster, reassign the shards (manually if necessary), then do a rolling node-by-node upgrade? Then, once all the datanodes have been replaced with 1.7 datanodes, do the same rolling update on the dedicated master nodes?
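For the "manually if necessary" part, I assume I'd be leaning on the cluster reroute API, something like the following; the index, shard number, and node names are placeholders, and whether a mixed 0.90/1.7 cluster would accept this at all is exactly the question I'm asking:

```
# move one shard copy off an old datanode onto a new one
curl -XPOST 'http://localhost:9200/_cluster/reroute' -d '{
  "commands": [
    {
      "move": {
        "index": "my_big_index",
        "shard": 0,
        "from_node": "dc1-data-01",
        "to_node": "dc2-data-01"
      }
    }
  ]
}'
```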