Data (cluster) migration between separate networks


(Derry O' Sullivan) #1

Hi all,

I'm doing some scalability planning for our ES deployment and looking at
a hypothetical situation where we want to move our ES data from one
cluster (e.g. a private network or a cloud such as EC2/Rackspace/other)
to another.

If I wanted to migrate from this private cluster (e.g. in my own office
subnet) to, say, an EC2 cluster, is it possible to do an 'auto' migration,
i.e. let ES manage it? For example, I would set up unicast discovery on
the EC2 nodes to 'find' the older instances on my private network and
allow the replicas to be created on EC2. I understand that I may need to
increase the number of replicas and make sure I have enough nodes in the
new cluster to meet that requirement.

I understand that this option will have latency issues (i.e. I would need
to keep the servers physically close, etc.), but would there be any other
issues? Once the data has been replicated successfully (i.e. enough
replicas exist on the new cluster), could the old cluster then be powered
down, with the data having been migrated?
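
One way to make the power-down condition concrete: the old nodes are only safe to stop once every shard has at least one started copy on a new node. The Python sketch below checks exactly that and nothing more. The record layout (`index`, `shard`, `state`, `node`) and the node names are assumptions made up for illustration; they are not an Elasticsearch API, and the records would have to be derived from whatever shard listing the cluster provides.

```python
def shards_missing_on_new_nodes(shard_copies, new_nodes):
    """Return sorted (index, shard) pairs that have no STARTED copy on a new node."""
    all_shards = set()
    covered = set()
    for copy in shard_copies:
        key = (copy["index"], copy["shard"])
        all_shards.add(key)
        # Only a fully started copy on a new node counts as migrated.
        if copy["state"] == "STARTED" and copy["node"] in new_nodes:
            covered.add(key)
    return sorted(all_shards - covered)


# Hypothetical shard listing: shard 1 is still initializing on EC2.
copies = [
    {"index": "logs", "shard": 0, "state": "STARTED", "node": "office-1"},
    {"index": "logs", "shard": 0, "state": "STARTED", "node": "ec2-1"},
    {"index": "logs", "shard": 1, "state": "STARTED", "node": "office-1"},
    {"index": "logs", "shard": 1, "state": "INITIALIZING", "node": "ec2-2"},
]
print(shards_missing_on_new_nodes(copies, {"ec2-1", "ec2-2"}))
```

An empty result would mean every shard has a started copy on the new side; anything else lists the shards that would be lost if the old nodes were stopped now.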

Thanks,

Derry

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Alexander Reelsen) #2

Hey,

Have you seen the scan search type at
http://www.elasticsearch.org/guide/reference/api/search/search-type/ ?

Is there anything speaking against that in your setup? I'm just wondering why
you want the complexity of connecting clusters across the internet
(with all the stability issues the internet offers).
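
A rough Python sketch of what a scan-based copy could look like: read pages of hits from the old cluster and forward each page to the new cluster as a bulk request. The `search_type=scan` path reflects the API of that era as I understand it, so check it against your version. `fetch_page` and `send_bulk` are hypothetical callables that would wrap the actual HTTP calls; they are not part of any client library.

```python
import json


def scan_request(index, scroll="5m", size=50):
    """Path and query string for opening a scan search on the old cluster."""
    return "/%s/_search?search_type=scan&scroll=%s&size=%d" % (index, scroll, size)


def to_bulk_body(hits, target_index):
    """Turn one page of search hits into a newline-delimited bulk body."""
    lines = []
    for hit in hits:
        action = {"index": {"_index": target_index,
                            "_type": hit["_type"],
                            "_id": hit["_id"]}}
        lines.append(json.dumps(action, sort_keys=True))
        lines.append(json.dumps(hit["_source"], sort_keys=True))
    if not lines:
        return ""
    return "\n".join(lines) + "\n"


def copy_index(fetch_page, send_bulk, target_index):
    """Drain the scroll and forward every page; return the number of hits copied.

    fetch_page() is assumed to return a list of hits, and an empty list once
    the scroll is exhausted. The initial scan response carries only a scroll
    id and no hits, so fetch_page is assumed to hide that first round trip.
    """
    copied = 0
    while True:
        hits = fetch_page()
        if not hits:
            return copied
        send_bulk(to_bulk_body(hits, target_index))
        copied += len(hits)
```

Writes that reach the old cluster while the copy is running are not handled by this sketch.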

--Alex

On Wed, Sep 18, 2013 at 4:55 PM, Derry O' Sullivan <derryos@gmail.com> wrote:



(Derry O' Sullivan) #3

Hi Alex,

Thanks for the response.

I guess the reason I want to do this is that I want a way of moving the
data from my old cluster to a new cluster with (preferably) no downtime. If
I had a good network link, I could think about sharing the data across the
networks if testing proved it was OK, but I really want to know whether ES's
replication model will handle migrating the data automatically, and whether
anyone has used this approach before.

Derry

On Wednesday, 18 September 2013 22:50:33 UTC+1, Alexander Reelsen wrote:



(Norberto Meijome) #4

Hey Derry,
I haven't tried this across high-latency links, so caveat emptor.

Assuming you have sorted out access, routing and addressing between EC2 and
your internal network, unicast should work. You should look into the
timing-related parameters (connect, discovery, etc.). Sorry, I don't have
the full docs handy right now.
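
As a starting point for the unicast and timing settings, here is a small Python sketch that builds the relevant `elasticsearch.yml` entries. The setting names are the zen discovery ones as I recall them from the documentation of that era, so verify them against your version. The host addresses and timeout values are placeholders, not recommendations.

```python
def unicast_settings(hosts, ping_timeout="30s", fd_ping_timeout="60s", fd_ping_retries=6):
    """Build zen discovery settings for a cluster that spans two networks."""
    quoted_hosts = ", ".join('"%s"' % host for host in hosts)
    return {
        # Multicast will not cross the link between the networks.
        "discovery.zen.ping.multicast.enabled": "false",
        "discovery.zen.ping.unicast.hosts": "[%s]" % quoted_hosts,
        # Allow more time for discovery over a slow link.
        "discovery.zen.ping_timeout": ping_timeout,
        # Fault detection: be more tolerant before dropping a node.
        "discovery.zen.fd.ping_timeout": fd_ping_timeout,
        "discovery.zen.fd.ping_retries": str(fd_ping_retries),
    }


def render_yaml(settings):
    """Render the flat settings as 'key: value' lines."""
    return "\n".join("%s: %s" % (key, settings[key]) for key in sorted(settings))


# Placeholder addresses for one old node and one new node.
print(render_yaml(unicast_settings(["192.0.2.10:9300", "192.0.2.11:9300"])))
```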

I would hope you have lots of shards; smaller chunks should make for faster
distribution, I think.

Start the new nodes in EC2. Something to try would be whether having a copy
or snapshot of the data ready for use, i.e. transferred to the new nodes
outside of ES, helps to bring cold shards online (if any).

Then yes, increase the number of replicas. I haven't found that to be super
fast. You can (and should) tell your cluster to remove specific old nodes
from routing (hint: for multiple nodes, you pass a list of nodes/IPs).
Obviously you have to keep an eye on your performance. I don't think there
is a one-size-fits-all answer.
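
The two steps above could be sketched as request bodies like the following. This assumes the index update-settings endpoint (`PUT /<index>/_settings`) and the cluster update-settings endpoint (`PUT /_cluster/settings`) with the `cluster.routing.allocation.exclude._ip` filter; please confirm both against the docs for your version. The IP addresses are placeholders.

```python
import json


def replica_settings(count):
    """Body for PUT /<index>/_settings that sets the replica count."""
    return json.dumps({"index": {"number_of_replicas": count}}, sort_keys=True)


def exclude_nodes_settings(old_ips):
    """Body for PUT /_cluster/settings that moves shards off the listed IPs."""
    # Multiple nodes are passed as one comma-separated list.
    setting = {"cluster.routing.allocation.exclude._ip": ",".join(old_ips)}
    return json.dumps({"transient": setting}, sort_keys=True)


print(replica_settings(2))
print(exclude_nodes_settings(["192.0.2.10", "192.0.2.11"]))
```

The exclusion is written as a transient setting here so that it does not survive a full cluster restart; whether that is what you want depends on how long the migration runs.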

Don't forget to have your load balancer include the new nodes, if that's how
you have it structured.

Good luck,
B

