Copying data between clusters

Rafal_Kuc_3 · March 23, 2012, 4:04pm

Hello,

We have two clusters we want to copy data between. We want to copy data
from cluster1 (0.18.5) to cluster2 (0.19.1). Is this possible?

Thanks,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch

Frederic · March 23, 2012, 7:59pm

Hi Rafał,

If you want to do this in real time, keeping both clusters updated together, I
guess you could use this plugin:

Besides that, I think you'll have to implement a custom solution, as there
is no notification/changes API in ES yet. You may take a look at this (quite
long) post about the feature:
https://groups.google.com/forum/?fromgroups#!searchin/elasticsearch/changes$20API/elasticsearch/S3fSfr4Cz3g/fyse5X4ofuYJ

If you only need to create a new 0.19 cluster from the current 0.18 data,
AFAIK you only need to back up the 0.18 data, following the steps mentioned here
https://groups.google.com/forum/?fromgroups#!topic/elasticsearch/BtjDXzKdAIk
(disable flushing, force a flush, copy, re-enable flushing). These steps are
reproduced by Karussell's script (Backup ElasticSearch with rsync · GitHub).

Then you use the copy of your data as the 'data' directory for your 0.19
server. When you start it for the first time, it should update the info to
the new format.
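
The backup steps above (disable flushing, force a flush, copy, re-enable flushing) could be sketched roughly as follows. This is only an illustration, not the linked script: it assumes the `index.translog.disable_flush` index setting is the switch those steps refer to, that the node answers on an HTTP URL you pass in, and that `rsync` is available. The function names `http_request`, `rsync_copy` and `backup` are made up for the example.

```python
import json
import subprocess
import urllib.request


def http_request(method, url, body=None):
    """Send a JSON request to the node and return the raw response bytes."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def rsync_copy(src, dst):
    """Copy the data directory with rsync (assumed to be installed)."""
    subprocess.check_call(["rsync", "-a", src, dst])


def backup(es_url, data_dir, backup_dir, send=http_request, copy=rsync_copy):
    """Disable flushing, force a flush, copy the files, re-enable flushing."""
    settings_url = es_url + "/_settings"
    # Assumed setting name: index.translog.disable_flush
    send("PUT", settings_url, {"index": {"translog.disable_flush": True}})
    try:
        send("POST", es_url + "/_flush")
        copy(data_dir, backup_dir)
    finally:
        # Re-enable flushing even if the flush or the copy failed.
        send("PUT", settings_url, {"index": {"translog.disable_flush": False}})
```

A call might look like `backup("http://localhost:9200", "/path/to/es/data/", "/path/to/backup/")`, where both paths are placeholders. The `send` and `copy` parameters exist only so the sequence can be dry-run with stand-in functions before touching a real node.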

Hope it helps (at least until you get a response from someone more
experienced with ES).

Cheers,

Rafal_Kuc_3 · March 23, 2012, 11:12pm

Thanks Frederic,

The 0.19.1 cluster is not clean (it already holds data), so I can't just
flush the 0.18.5 one, copy the data directory between nodes and then start
0.19.1, as I'd lose the data in the new cluster that way. What I would like
to achieve is to copy the data from 0.18.5 into 0.19.1.

I was thinking about connecting those two clusters together to form a new
one, increase the replica count to be equal to the number of nodes and wait
for ES to finish copying the data. In theory it seems doable, but I don't know if
there are any practical obstacles that I should be aware of, like version
incompatibilities.

Regards,
Rafał

jprante · March 24, 2012, 5:42pm

You can't connect two clusters running 0.18.5 and 0.19.1. You will see
something like this in the 0.18.5 logs:

[2012-03-24 ...][WARN ][discovery.zen.ping.multicast] [Spyne] failed to read requesting node from /...:...
java.io.IOException: Expected handle header, got [-9]
    at org.elasticsearch.common.io.stream.HandlesStreamInput.readUTF(HandlesStreamInput.java:63)
    at org.elasticsearch.cluster.ClusterName.readFrom(ClusterName.java:63)
    at org.elasticsearch.cluster.ClusterName.readClusterName(ClusterName.java:58)
    at org.elasticsearch.discovery.zen.ping.multicast.MulticastZenPing$Receiver.run(MulticastZenPing.java:363)
    at java.lang.Thread.run(Thread.java:722)

This seems to be caused by slight changes in the header structure of the
multicast pings.
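
A quick way to spot such a mismatch before trying to join nodes might be to compare the version each node reports. The sketch below assumes that a node's root endpoint ("GET /") returns JSON with a `version.number` field, and it treats "same major.minor" as a rough compatibility heuristic. That rule is my own assumption, not something ES guarantees. The helper names are made up for the example.

```python
import json


def version_of(root_response):
    """Extract the version string from a node's root endpoint JSON response."""
    return json.loads(root_response)["version"]["number"]


def same_release_line(version_a, version_b):
    """Heuristic (an assumption): compare only the major.minor parts."""
    return version_a.split(".")[:2] == version_b.split(".")[:2]


# Hand-written sample responses, trimmed to the field used here.
old_node = '{"ok": true, "version": {"number": "0.18.5"}}'
new_node = '{"ok": true, "version": {"number": "0.19.1"}}'

if not same_release_line(version_of(old_node), version_of(new_node)):
    print("Nodes report different release lines; do not expect them to form one cluster.")
```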

Jörg

Rafal_Kuc_3 · March 24, 2012, 6:10pm

Hi,

That's what I was afraid of. Thanks Jörg.

Rafał
