Migrate data

I am currently running 3 small EC2 instances and I want to move
to 2 large EC2 instances.

What is the best way to migrate the data? I have already brought up a
large EC2 instance, but if I shut down one of my small instances,
half the documents disappear. Again, I'm not too clear on how
failover and sharding work behind the scenes.

Thanks,
David

Here is a post explaining how the gateway works and how partial cluster
failure works:
http://www.elasticsearch.com/blog/2010/02/16/searchengine_time_machine.html.

Shards are allocated across the nodes that have started, and I find it strange
that half your docs disappeared when you shut down one node. They should not,
since you have replicas on the other nodes. In general, as long as you don't do
a full cluster shutdown, you can bring more nodes into play and then shut down
the other nodes. Shards will be allocated automatically across the nodes
(along with their replicas).

Do you use unicast discovery on Amazon? How do you configure it?

-shay.banon
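
The approach described above (add the new nodes, let shards and replicas spread out, then retire the old nodes one at a time) can be sketched as a small check to run before each shutdown. This is a minimal sketch, not an official procedure. It assumes the cluster health endpoint (`GET /_cluster/health`) and the response fields `status`, `number_of_nodes`, `relocating_shards`, `initializing_shards`, and `unassigned_shards` as found in current Elasticsearch releases, which may differ from the 2010 versions discussed in this thread. The helper names and the `localhost:9200` URL are hypothetical.

```python
import json
from urllib.request import urlopen


def safe_to_stop_a_node(health, number_of_replicas, min_nodes_after):
    """Return True if stopping one node should leave a copy of every shard.

    `health` is the parsed JSON body of a cluster health response
    (field names are assumptions, see the note above).
    """
    # Without replicas, the stopped node holds the only copy of its shards.
    if number_of_replicas < 1:
        return False
    # Green means every primary and every replica is assigned.
    if health.get("status") != "green":
        return False
    # Wait until no shard is still moving, recovering, or unassigned.
    busy = (health.get("relocating_shards", 0)
            + health.get("initializing_shards", 0)
            + health.get("unassigned_shards", 0))
    if busy:
        return False
    # Keep at least `min_nodes_after` nodes once this one is gone.
    return health.get("number_of_nodes", 0) - 1 >= min_nodes_after


def fetch_health(base_url="http://localhost:9200"):
    """Fetch cluster health; the URL is a placeholder for your own node."""
    with urlopen(base_url + "/_cluster/health") as resp:
        return json.load(resp)
```

For the migration in this thread, the idea would be to start both large instances, then call `safe_to_stop_a_node(fetch_health(), 1, 2)` before stopping each small instance, and stop the next one only when it returns True again.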

On Fri, Jul 16, 2010 at 2:04 AM, David Jensen djensen47@gmail.com wrote:

I currently am running 3 small instance EC2 nodes and I want to move
to 2 large EC2 instances.

What is the best way to migrate the data? I brought up a large EC2
instance already but if I shut down one of my small instances, then
half the documents disappear. Again, I'm not too clear how the
failover and sharding is working behind the scenes.

Thanks,
David

I'm using the cloud discovery.

Here's another weird happenstance. On the node that I shut down, all of
the index files disappeared. Poof, gone.

This happened once before too. I realized that I needed more disk
space so I moved the entire elasticsearch directory ... when I started
up again, all of the index files were gone.

On Jul 15, 4:08 pm, Shay Banon shay.ba...@elasticsearch.com wrote:

Here is a post explaining how gateway works and how partial cluster failure
works: http://www.elasticsearch.com/blog/2010/02/16/searchengine_time_machin....

Shards are allocated between nodes started, and I find it strange that half
your docs disappeared when you shut down one node. They should not, since
you have replicas on the other node. In general, as long as you don't do
full cluster shutdown, you can bring more nodes into play and then shut down
the other nodes. Shards will be allocated automatically between the nodes
(with the replicas).

Do you use unicast discovery in amazon? How do you configure it?

-shay.banon

That's how elasticsearch works. Read the blog I posted.

On Fri, Jul 16, 2010 at 2:13 AM, David Jensen djensen47@gmail.com wrote:

I'm using the cloud discovery.

Here's another weird happenstance. On the node that I shut down, all of
the index files disappeared. Poof, gone.

This happened once before too. I realized that I needed more disk
space so I moved the entire elasticsearch directory ... when I started
up again, all of the index files were gone.

I posted first, read second.

It's not terribly explicit but, yes, implicitly, shutting down means
losing your data unless there is a gateway.

Does each node need its own gateway?

On Jul 15, 4:18 pm, Shay Banon shay.ba...@elasticsearch.com wrote:

That's how elasticsearch works. Read the blog I posted.

You would also not lose your data by shutting down one node if you have a
replica (which should normally be the case).

Regards,
Berkay Mollamustafaoglu
http://www.ifountain.com
Ph: +1 (571) 766-6292
mberkay on yahoo, google and skype
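
To illustrate the point about replicas, here is a toy model of shard placement. It is only a sketch: the round-robin placement below is an assumption made for illustration, not elasticsearch's actual allocation algorithm, and `allocate` and `surviving_shards` are hypothetical names.

```python
def allocate(num_shards, num_replicas, nodes):
    """Place each shard copy on a distinct node, round-robin (toy model)."""
    if num_replicas + 1 > len(nodes):
        raise ValueError("need at least one node per shard copy")
    placement = {node: set() for node in nodes}
    for shard in range(num_shards):
        # Copy 0 is the primary; the others are replicas on the next nodes.
        for copy in range(num_replicas + 1):
            node = nodes[(shard + copy) % len(nodes)]
            placement[node].add(shard)
    return placement


def surviving_shards(placement, stopped_node):
    """Shards that still have at least one copy after a node stops."""
    alive = set()
    for node, shards in placement.items():
        if node != stopped_node:
            alive |= shards
    return alive
```

In this model, 4 shards on 2 nodes with zero replicas leave only 2 shards reachable after one node stops, which matches the "half the documents disappear" symptom. With one replica, all 4 shards survive.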

On Thu, Jul 15, 2010 at 7:30 PM, David Jensen djensen47@gmail.com wrote:

I posted first, read second.

It's not terribly explicit but, yes, implicitly, shutting down means
losing your data unless there is a gateway.

Does each node need its own gateway?

I don't know what I'm doing wrong but every time I kill off a node, I
lose half my documents. I guess I need more replicas?

On Jul 15, 4:35 pm, Berkay Mollamustafaoglu mber...@gmail.com wrote:

You would also not lose your data by shutting down one node if you have a
replica (which should be the case normally)

Regards,
Berkay Mollamustafaoglu
http://www.ifountain.com
Ph: +1 (571) 766-6292
mberkay on yahoo, google and skype

It sounds like you don't have any replicas. You may want to check the config
for your indexes. There are some REST API calls you can use to see how many
shards and replicas there are, their status, etc.
You should also check that the nodes are talking to each other successfully,
just to be sure, if you haven't done that already.

Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype
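
One way to check the replica count is to read the index settings over REST. The following is a minimal sketch, assuming the `GET /<index>/_settings` response layout of current Elasticsearch releases, where each index name maps to `{"settings": {"index": {"number_of_replicas": ...}}}`. The 2010 releases discussed in this thread may format the response differently. `replica_counts` is a hypothetical helper name.

```python
def replica_counts(settings_response):
    """Map each index name to its configured number of replicas.

    `settings_response` is the parsed JSON body of an index settings
    request (layout is an assumption, see the note above). An index
    whose entry does not carry the setting maps to None.
    """
    counts = {}
    for index_name, body in settings_response.items():
        index_settings = body.get("settings", {}).get("index", {})
        raw = index_settings.get("number_of_replicas")
        # Setting values typically arrive as strings, so convert to int.
        counts[index_name] = int(raw) if raw is not None else None
    return counts
```

Any index that reports 0 here has a single copy of each shard, so stopping the node that holds a shard makes that shard's documents unavailable.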

On Thu, Jul 15, 2010 at 7:58 PM, David Jensen djensen47@gmail.com wrote:

I don't know what I'm doing wrong but every time I kill off a node, I
lose half my documents. I guess I need more replicas?

I'm using the default settings. I'll check to see if the default has
replicas or not.

I'm also going to set up an S3 gateway, just in case.

The nodes are talking to each other. When I do a search on each node,
the results are the same for each.

Thanks for the help everybody, it's great when an open source project
has a supportive community.

On Jul 15, 7:03 pm, Berkay Mollamustafaoglu mber...@gmail.com wrote:

It sounds like you don't have any replicas. You may want to check the config
for your indexes. There are some REST API calls you can use to see how many
shards, replicas there are, status etc.
You should also check that the nodes are talking to each other successfully
just to be sure if you haven't done that already.

Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype

Are you using 0.8? There was a similar problem in 0.8 that was fixed in the
upcoming 0.9 (god, this version is long overdue, working hard on getting it
out). Note that current master (0.9) does not have a functioning cloud gateway
plugin.

-shay.banon

On Sat, Jul 17, 2010 at 1:47 AM, David Jensen djensen47@gmail.com wrote:

I'm using the default settings. I'll check to see if the default has
replicas or not.

I'm also going to set up an S3 gateway, just in case.

The nodes are talking to each other. When I do a search on each node,
the results are the same for each.

Thanks for the help everybody, it's great when an open source project
has a supportive community.
