Will a rolling restart lose data?


(Grant Rodgers) #1

I've been looking at the docs and couldn't find the answer to this, so
I thought I'd ask the list.

Say I've changed the configuration of my elasticsearch cluster, and I
want to restart every node to pick up the new config. Is there a
possibility of losing data? Seems like there could be if every
machine that hosts a shard happens to be restarting at the same time.

What's a good way to do a rolling restart without losing anything?

Thanks!


(Clinton Gormley) #2

From what I've seen of doing this in practice, before a node shuts down,
it syncs its data with the other nodes, and possibly tries to snapshot
the gateway (if required).

So, shut down one node, restart it, wait for the cluster_health to be
'green' again (requires at least three running nodes, I think), then
move on to the next node.
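
For anyone scripting this, the loop above could be sketched roughly as follows. This is only a sketch under assumptions: `restart_node`, `get_health`, and `base_url` are hypothetical placeholders for however you restart a node and reach the cluster, and the only Elasticsearch piece relied on is the cluster health endpoint (`/_cluster/health`) reporting a `status` field.

```python
import json
import time
import urllib.request


def fetch_health(base_url):
    """Return the parsed JSON body of GET <base_url>/_cluster/health."""
    with urllib.request.urlopen(base_url + "/_cluster/health") as resp:
        return json.load(resp)


def wait_for_green(get_health, timeout_s=300.0, poll_s=2.0, sleep=time.sleep):
    """Poll get_health() until status is 'green'.

    Returns True once green is seen, False if timeout_s elapses first.
    """
    waited = 0.0
    while True:
        if get_health().get("status") == "green":
            return True
        if waited >= timeout_s:
            return False
        sleep(poll_s)
        waited += poll_s


def rolling_restart(nodes, restart_node, get_health, **wait_kwargs):
    """Restart nodes one at a time, waiting for green after each restart.

    restart_node is a hypothetical callable supplied by the caller
    (ssh, init script, ...). Raises RuntimeError if the cluster is not
    green before starting or does not return to green after a restart.
    """
    if not wait_for_green(get_health, **wait_kwargs):
        raise RuntimeError("cluster is not green; refusing to start")
    restarted = []
    for node in nodes:
        restart_node(node)
        restarted.append(node)
        if not wait_for_green(get_health, **wait_kwargs):
            raise RuntimeError(f"cluster did not return to green after {node}")
    return restarted
```

Something like `rolling_restart(nodes, restart_node, lambda: fetch_health("http://localhost:9200"))` would tie it together, assuming a node is reachable at that address. Refusing to continue when the cluster is not green is the point: it avoids taking a second node down while shards are still recovering.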

clint


(Shay Banon) #3

When you shut down a node, the shards allocated to it are reallocated to other
nodes. When you start it up again, shards are redistributed to it. If, for
example, you have number_of_replicas set to 1, then each shard will
have a replica, meaning that if it is assigned to a specific node, it will
recover its state from another node that is running. If you have a gateway
set, then the first shard will recover its state from the gateway.
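
To make the replica reasoning concrete, here is a toy model in Python. It is not Elasticsearch's actual allocation algorithm; the round-robin placement and the function names are assumptions made purely for illustration. It shows which shards would have no live copy left for a given set of nodes being down at the same time.

```python
def place_shards(num_shards, nodes, number_of_replicas=1):
    """Toy round-robin placement: each shard's copies land on distinct nodes."""
    copies = number_of_replicas + 1  # one primary plus its replicas
    if copies > len(nodes):
        raise ValueError("need at least as many nodes as copies per shard")
    return {
        shard: {nodes[(shard + i) % len(nodes)] for i in range(copies)}
        for shard in range(num_shards)
    }


def shards_without_live_copy(placement, down_nodes):
    """Return sorted shard ids whose every copy sits on a node that is down."""
    down = set(down_nodes)
    return sorted(s for s, holders in placement.items() if holders <= down)


nodes = ["a", "b", "c"]
placement = place_shards(5, nodes, number_of_replicas=1)
print(shards_without_live_copy(placement, ["a"]))       # [] : one node down
print(shards_without_live_copy(placement, ["a", "b"]))  # [0, 3] : two nodes down
```

In this model, with number_of_replicas set to 1, restarting one node at a time never leaves a shard without a running copy to recover from. Taking two nodes down together does, and those shards would then have to recover from the gateway, if one is set.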

cheers,
shay.banon


(jimmy_wen) #4

If a complete cluster shutdown happens, it will still lose data, right?

