Connect a new node via SSH tunnel


(Matt) #1

I'm attempting to move data from one cluster to a new cluster without downtime.

So far, I have one node on the old cluster (this is a test setup) that's on a private subnet. I have a bastion host on that network that I connect through. My new cluster also has one node and, again, sits behind a bastion, so it isn't directly accessible.

I was hoping I'd be able to create an SSH tunnel from my new cluster machine that connects to the machine in the old cluster (so something like localhost:9300 ==> remote:9300), and then be able to set the following in my elasticsearch.yml:

discovery.zen.ping.unicast.hosts: ["127.0.0.1"]
network.publish_host: 127.0.0.1
network.bind_host: 127.0.0.1 
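
For anyone following along, here is a minimal sketch of the kind of tunnel described above (localhost:9300 ==> remote:9300), forwarded through the old cluster's bastion. The bastion login and the old node's private IP are hypothetical placeholders, not values from this setup. The script only builds and prints the command so it can be reviewed before it is run.

```shell
#!/bin/sh
# Hypothetical values (assumptions): replace with your own hosts.
OLD_BASTION="matt@old-bastion.example.com"   # login on the old cluster's bastion
OLD_NODE_IP="10.0.1.10"                      # private IP of the old cluster node
LOCAL_PORT=9300                              # port opened on the new cluster machine
REMOTE_PORT=9300                             # Elasticsearch transport port on the old node

# -N: do not run a remote command, just forward ports
# -L: listen on 127.0.0.1:LOCAL_PORT and forward to OLD_NODE_IP:REMOTE_PORT via the bastion
TUNNEL_CMD="ssh -N -L 127.0.0.1:${LOCAL_PORT}:${OLD_NODE_IP}:${REMOTE_PORT} ${OLD_BASTION}"

# Print the command for review instead of opening the tunnel straight away.
echo "$TUNNEL_CMD"
```

Once the printed command looks right, it can be started with eval "$TUNNEL_CMD" (or pasted into a terminal) on the new cluster machine.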

I've tried this, but all I'm seeing is bound_address {inet[/127.0.0.1:9301]}, publish_address {inet[/127.0.0.1:9301]}

Has anyone managed to connect two clusters over an SSH tunnel with success?


(Matt) #2

Should probably also point out that this is ES 0.90.13 on both nodes.


(Mike Simos) #3

I'd probably use the following method (see 0.90.x and earlier) to back up your data:

https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-upgrade.html#backup

Then restore it to the same location on the other cluster/node.


(Matt) #4

Hi Mike! Thanks for the reply, but it's not going to be possible to take the production cluster down when it comes to rolling this out there. I basically need to relocate shards to the new cluster and then perform the switchover in a very small amount of time, too short to back up, copy and restore XXGb of data.


(Mike Simos) #5

Use stream2es then:

You can copy your indexes from one cluster to another.

$ stream2es es --source http://foo.local:9200/wiki --target http://bar.local:9200/wiki


(Matt) #6

Thanks for the link, Mike. From what I can see so far (correct me if I'm wrong), this will not take care of writes and deletes that happen while the copy is in progress.

The reason I was thinking of doing this via a tunnel was so that I could have both clusters connected in advance and, once they are synchronised, remove the tunnel when the changeover occurs. This would allow me a bit of freedom while other (DB) services synchronise too, with no writes lost and downtime minimised.


(Mike Simos) #7

You're probably better off creating a VPN connection between the two machines. If you're really concerned about not losing any data, SSH is not the way to go: SSH sessions can hang, break, or be slow when transferring large amounts of data. If you're really intent on using an SSH tunnel, you can try creating a SOCKS proxy (ssh -D) on each machine to your jump box, then using tsocks to wrap the elasticsearch java process. This should let both machines talk to each other using each machine's real IP over the SOCKS 5 proxy. Both machines need to be able to access port 9300 on each other, so I don't know if it's really possible with port forwarding alone, but I'd expect an SSH SOCKS proxy with tsocks to work. If you corrupt or lose your data, though, you're on your own.
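
A rough, untested sketch of the SOCKS proxy plus tsocks idea above, to be repeated on each machine. The jump box login, the SOCKS port 1080, and the tsocks.conf file name are assumptions made for illustration. The script only prints the ssh command and the tsocks settings for review; it does not open the proxy or start Elasticsearch.

```shell
#!/bin/sh
# Hypothetical values (assumptions): replace with your own.
JUMP_BOX="user@jumpbox.example.com"   # login on the jump box / bastion
SOCKS_PORT=1080                       # local port for the SOCKS proxy

# -N: no remote command, -D: open a dynamic (SOCKS) proxy on 127.0.0.1:SOCKS_PORT
SOCKS_CMD="ssh -N -D 127.0.0.1:${SOCKS_PORT} ${JUMP_BOX}"

# tsocks settings pointing at the local SOCKS 5 proxy opened by ssh -D
TSOCKS_CONF="server = 127.0.0.1
server_port = ${SOCKS_PORT}
server_type = 5"

# Print both for review.
echo "$SOCKS_CMD"
echo "$TSOCKS_CONF"

# After saving the settings to ./tsocks.conf and starting the proxy,
# the node could be wrapped roughly like this (0.90.x foreground mode):
#   TSOCKS_CONF_FILE=./tsocks.conf tsocks bin/elasticsearch -f
```

Whether the Java process behaves well under tsocks is not something this sketch verifies, so it is worth trying on the test setup first.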

