Can anyone tell me how to start a local ES instance and perform an async
replication of the data from the cloud over to my machine? If replication
is not the right term, then let's call it a data dump or whatever else fits
the bill here. Any thoughts?
But Shay's comment, "copy of the data directory of each production cluster,
and move it to development" makes sense. I'll try that out, but in the
meantime any other direct answers to my question would be most welcome.
So since it's tougher to answer a question when something actually cannot
happen, I take it that there is no way to start a local ES instance and
perform an async replication of the data from the cloud over to my machine?
Right, there is no built-in mechanism to do that. You'd have to write code
that reads from one and writes to the other yourself.
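For what it's worth, here is a rough sketch of what such a copy script could
look like, pulling documents out of the remote cluster with the scroll API and
pushing them into the local one with the bulk API. This is just an
illustration in Python using the requests library; the hosts, index name, and
page size are made-up placeholders, and the exact shape of the scroll calls
differs between Elasticsearch versions, so check the docs for the version you
run:

# Sketch: read every document from a source cluster via scroll and
# write it to a target cluster via bulk. Hosts/index are placeholders.
import json
import requests

SOURCE = "http://remote-es.example.com:9200"   # production cluster (placeholder)
TARGET = "http://localhost:9200"               # local dev cluster
INDEX = "my_index"                             # placeholder index name

# Open a scroll over the whole index.
resp = requests.post(SOURCE + "/" + INDEX + "/_search?scroll=5m",
                     json={"size": 500, "query": {"match_all": {}}}).json()
scroll_id = resp["_scroll_id"]
hits = resp["hits"]["hits"]

while hits:
    # Build a bulk body: one action line plus one source line per document.
    lines = []
    for hit in hits:
        lines.append(json.dumps({"index": {"_index": hit["_index"],
                                            "_id": hit["_id"]}}))
        lines.append(json.dumps(hit["_source"]))
    requests.post(TARGET + "/_bulk",
                  data="\n".join(lines) + "\n",
                  headers={"Content-Type": "application/x-ndjson"})

    # Fetch the next page of results.
    resp = requests.post(SOURCE + "/_search/scroll",
                         json={"scroll": "5m", "scroll_id": scroll_id}).json()
    scroll_id = resp["_scroll_id"]
    hits = resp["hits"]["hits"]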
Copying the files should have worked, not sure what the problem is. There
have been some discussions about this on the list. If you search for
"backup" in the mailing list archive you should be able to find those
discussions.
Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype
On Fri, Mar 16, 2012 at 8:58 AM, Pulkit Singhal <pulkitsinghal@gmail.com> wrote:
So since it's tougher to answer a question when something
actually cannot happen, I take it that there is no way to
start a local ES instance and perform an async replication of
the data from the cloud over to my machine?
On Thu, Mar 15, 2012 at 12:57 PM, Pulkit Singhal
<pulkitsinghal@gmail.com> wrote:
Unfortunately the SCP command to get the data
directory runs into problems early on:
====
Sending file modes: C0664 51257065 _5kr.fdt
Sink: C0664 51257065 _5kr.fdt
_5kr.fdt 46% 23MB 0.0KB/s - stalled -
====
So I guess that's not a good route to go.
On Thu, Mar 15, 2012 at 12:39 PM, Pulkit Singhal
<pulkitsinghal@gmail.com> wrote:
The following thread doesn't really answer my
question (maybe because I don't get how to
set it up):
http://elasticsearch-users.115913.n3.nabble.com/how-to-dump-the-entire-contents-of-ES-td2758234.html
But Shay's comment, "copy of the data
directory of each production cluster, and move
it to development" makes sense. I'll try that
out, but in the meantime any other direct
answers to my question would be most welcome.
On Thu, Mar 15, 2012 at 12:21 PM, gearond
<gearond@sbcglobal.net> wrote:
Want to know this also. Posting to
watch replies.
A better way to make the transfer, so that it can be picked up where it left
off, is to:
a) zip up the data directory:
tar -zcvf data.tar.gz /opt/elasticsearch/data
b) use the rsync command, which will let you pick up where you left off in
case something messes up:
rsync --rsh='ssh -i /users/xxx/.ec2/ec2.pem' --partial --progress ec2-user@XXX.XXX.XXX.XXX:/opt/elasticsearch/data.tar.gz ~/dev/elasticsearch/
That's truly awesome Clint!
For me it's unfortunate that I'm not using Perl (and therefore my working
knowledge has atrophied), which is also why I had trouble making the best
of the "terms of endearment" slides. Not sure if I can learn enough to
simply run with this, but I'll try. Thanks for the great work.
Yea, copying over the data location to your local environment should work,
not sure why it failed. Usually it's recommended that you disable flush
when doing so.
Another option, which Clinton provided an example for, is to use scroll
search and reindex the data (or a portion of it, based on the query).
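In case it helps, here is roughly how you could toggle flushing off and back
on around the copy, via the index update-settings API. Note the setting name
"index.translog.disable_flush" is my recollection of what applied to ES
releases of that era, so treat it as an assumption and verify it against the
version you run; the host and index name are placeholders:

# Sketch: disable translog flushing on an index before copying its data
# directory, then re-enable it afterwards. The setting name below is an
# assumption based on older ES releases; check the docs for your version.
import requests

HOST = "http://remote-es.example.com:9200"   # placeholder production host
INDEX = "my_index"                           # placeholder index name

def set_disable_flush(value):
    requests.put(HOST + "/" + INDEX + "/_settings",
                 json={"index": {"translog": {"disable_flush": value}}})

set_disable_flush(True)
# ... copy the data directory here (e.g. the tar + rsync steps above) ...
set_disable_flush(False)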