Migrate data


(David Jensen-2) #1

I am currently running 3 small EC2 instances and I want to move
to 2 large EC2 instances.

What is the best way to migrate the data? I brought up a large EC2
instance already, but if I shut down one of my small instances,
half the documents disappear. Again, I'm not too clear on how
failover and sharding work behind the scenes.

Thanks,
David


(Shay Banon) #2

Here is a post explaining how gateway works and how partial cluster failure
works:
http://www.elasticsearch.com/blog/2010/02/16/searchengine_time_machine.html.

Shards are allocated among the started nodes, and I find it strange that half
your docs disappeared when you shut down one node. They should not, since
you have replicas on the other node. In general, as long as you don't do a
full cluster shutdown, you can bring more nodes into play and then shut down
the other nodes. Shards will be reallocated automatically among the remaining
nodes (along with the replicas).
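
A minimal sketch of the rolling approach described above, assuming the cluster health REST endpoint (`GET /_cluster/health`) and the field names `status`, `relocating_shards` and `unassigned_shards` as documented for current Elasticsearch releases; the 0.8/0.9 versions discussed in this thread may respond differently. The host address and the helper names `fetch_health` and `safe_to_stop_node` are hypothetical.

```python
import json
import urllib.request


def fetch_health(base_url="http://localhost:9200"):
    """Fetch cluster health as a dict (base_url is a hypothetical address)."""
    with urllib.request.urlopen(base_url + "/_cluster/health") as resp:
        return json.load(resp)


def safe_to_stop_node(health):
    """Return True only when every shard copy is assigned and none is moving.

    Assumes the response carries 'status', 'relocating_shards' and
    'unassigned_shards'; a missing field is treated as "not safe".
    """
    return (
        health.get("status") == "green"
        and health.get("relocating_shards", 1) == 0
        and health.get("unassigned_shards", 1) == 0
    )
```

The idea would be to start a new node, poll until `safe_to_stop_node(fetch_health())` is true, and only then stop one old node, repeating once per node to be retired.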

Do you use unicast discovery on Amazon? How do you configure it?

-shay.banon


(David Jensen-2) #3

I'm using the cloud discovery.

Here's another weird happenstance. On the node that I shut down, all of
the index files disappeared. Poof, gone.

This happened once before too. I realized that I needed more disk
space so I moved the entire elasticsearch directory ... when I started
up again, all of the index files were gone.


(Shay Banon) #4

That's how elasticsearch works. Read the blog I posted.


(David Jensen-2) #5

I posted first, read second.

It's not terribly explicit but, yes, implicitly, shutting down means
losing your data unless there is a gateway.

Does each node need its own gateway?


(Berkay Mollamustafaoglu-2) #6

You would also not lose your data by shutting down one node if you have a
replica (which should normally be the case).
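
To illustrate why a replica matters here, below is a toy model, not Elasticsearch's actual allocator. It assumes a hypothetical round-robin placement in which copy `c` of shard `s` lives on node `(s + c) % num_nodes`, and reports what fraction of shards still has a live copy after one node stops. The function name and the shard and node counts are made up for the example.

```python
def surviving_fraction(num_shards, num_replicas, num_nodes, stopped_node):
    """Fraction of shards with at least one copy left after one node stops.

    Toy placement: copy c of shard s lives on node (s + c) % num_nodes.
    This is NOT Elasticsearch's allocator, only an illustration.
    """
    alive = 0
    for shard in range(num_shards):
        # Nodes holding the primary (c = 0) and each replica (c >= 1).
        copies = {(shard + c) % num_nodes for c in range(num_replicas + 1)}
        if copies - {stopped_node}:
            alive += 1
    return alive / num_shards


print(surviving_fraction(4, 0, 2, stopped_node=0))  # 0.5
print(surviving_fraction(4, 1, 2, stopped_node=0))  # 1.0
```

Under these assumptions, two nodes with no replicas lose half the shards when one node stops, while a single replica keeps every shard available.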

Regards,
Berkay Mollamustafaoglu
http://www.ifountain.com
Ph: +1 (571) 766-6292
mberkay on yahoo, google and skype


(David Jensen-2) #7

I don't know what I'm doing wrong but every time I kill off a node, I
lose half my documents. I guess I need more replicas?


(Berkay Mollamustafaoglu-2) #8

It sounds like you don't have any replicas. You may want to check the config
for your indexes. There are some REST API calls you can use to see how many
shards and replicas there are, their status, etc.
You should also check that the nodes are talking to each other successfully,
just to be sure, if you haven't done that already.
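
As a rough sketch of such a check, the snippet below assumes the per-index form of the cluster health call (`GET /_cluster/health?level=indices`) and the `indices` / `number_of_replicas` field names as documented for current Elasticsearch releases; older versions such as the ones in this thread may differ. The sample response is invented for illustration and `indices_without_replicas` is a hypothetical helper.

```python
def indices_without_replicas(health):
    """List index names whose configured replica count is zero.

    `health` is assumed to be the parsed JSON of
    GET /_cluster/health?level=indices.
    """
    indices = health.get("indices", {})
    return sorted(
        name
        for name, info in indices.items()
        if int(info.get("number_of_replicas", 0)) == 0
    )


# Hypothetical response fragment, for illustration only.
sample = {
    "status": "green",
    "indices": {
        "docs": {"number_of_shards": 5, "number_of_replicas": 0},
        "logs": {"number_of_shards": 5, "number_of_replicas": 1},
    },
}
print(indices_without_replicas(sample))  # ['docs']
```

Any index this reports has only a single copy of each shard, so stopping the node that holds a shard makes that shard's documents unavailable.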

Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype


(David Jensen-2) #9

I'm using the default settings. I'll check to see if the default has
replicas or not.

I'm also going to set up an S3 gateway, just in case.

The nodes are talking to each other. When I do a search on each node,
the results are the same for each.

Thanks for the help everybody, it's great when an open source project
has a supportive community.


(Shay Banon) #10

Are you using 0.8? There was a similar problem in 0.8 that is fixed in the
upcoming 0.9 (god, this version is long overdue, working hard on getting it
out). Note that current master (0.9) does not have a functioning cloud gateway
plugin.

-shay.banon

