Correct way to restart a cluster?


(phobos182) #1

On restart, ElasticSearch re-copies every shard the node holds a replica of. If I have a 10-node cluster and restart one node, that node copies all of its data over again, even though the restart lasted only about 10 seconds.

Does ElasticSearch have any way to detect missing files, and just copy them over instead of the entire index? How would I restart a node without re-copying terabytes worth of information?


(Shay Banon) #2

There isn't a simple way to restart a node without causing it to start
migrating shards.

But the new cluster-level update settings API makes this possible.
Basically, we can add a setting that suspends allocation: you set it through
the cluster update settings API before the restart, and then, once the
restart is done, you re-enable allocation.
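For readers who want to see roughly what that looks like, here is a minimal sketch that builds (and optionally sends) a cluster update settings request. It assumes the setting that came out of the issue below is named `cluster.routing.allocation.disable_allocation`, which to my knowledge is what early releases used; newer releases use `cluster.routing.allocation.enable` instead, so check the docs for your version. The host URL and the helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: setting name used by early Elasticsearch releases. Newer
# releases replaced it with "cluster.routing.allocation.enable".
DISABLE_ALLOCATION = "cluster.routing.allocation.disable_allocation"


def build_allocation_request(host, disabled):
    """Build (but do not send) a PUT to the cluster update settings endpoint."""
    # "transient" settings do not survive a full cluster restart, which suits
    # a temporary switch flipped around a single node restart.
    body = json.dumps({"transient": {DISABLE_ALLOCATION: disabled}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host}/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def set_allocation_disabled(host, disabled):
    """Send the request to a live cluster and return the raw response body."""
    with urllib.request.urlopen(build_allocation_request(host, disabled)) as resp:
        return resp.read().decode("utf-8")
```

For a rolling restart the idea would be to call `set_allocation_disabled("http://localhost:9200", True)`, restart the node, wait for it to rejoin, and then call it again with `False`.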

I opened an issue:
https://github.com/elasticsearch/elasticsearch/issues/1358. There is already
an issue open to allow for less disruptive node restarts.

-shay.banon


(phobos182) #3

Thanks. I just saw that you pushed the change. Thanks for that! At least I know that I will not cause needless network traffic for rolling restarts.

I have one simple question regarding shard replica transfer. I know that with Solr, a slave checks all segment files in the directory and copies only the segments it does not have, or newly merged segments. Basically an 'rsync', so it's as efficient as possible.

Does ElasticSearch have any logic of this type when copying shards to replicas? I ask because we have on the order of 40TB of indexes w/replicas that are constantly being updated in real time. I do need to perform rolling restarts for upgrades to cluster members.


(Shay Banon) #4

Yes, elasticsearch will reuse the same index files where possible when it
allocates replicas. Note that replication itself is done quite differently,
though; see more about it in this session:
http://www.elasticsearch.org/videos/2011/08/09/road-to-a-distributed-searchengine-berlinbuzzwords.html
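To illustrate the general idea of reusing files that are already on disk, here is a small sketch of an rsync-style comparison. It is not Elasticsearch's actual recovery code; the function name, the file names, and the use of a per-file checksum map are all assumptions made for illustration.

```python
from typing import Dict, List


def files_to_copy(source: Dict[str, str], target: Dict[str, str]) -> List[str]:
    """Return names of source files the target lacks or holds with a different checksum.

    Both arguments map a file name to a checksum (hypothetical metadata).
    """
    return sorted(
        name for name, checksum in source.items() if target.get(name) != checksum
    )


# Hypothetical segment listings for a primary and a restarted replica.
primary = {"_0.cfs": "a1", "_1.cfs": "b2", "segments_3": "c3"}
replica = {"_0.cfs": "a1", "segments_2": "zz"}

# Only the files the replica is missing need to travel over the network.
print(files_to_copy(primary, replica))  # ['_1.cfs', 'segments_3']
```

The longer a node stays down while indexing and merging continue, the fewer of its files will still match, so a short restart should benefit the most from this kind of reuse.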

