Local gateway backup coordination

(James Bardin) #1

Using the local gateway and backing up the data directory seems to be an
often-recommended and easy solution, but is this feasible as a cluster
grows (without a central coordinator)? Aren't there inherently going to be
timing issues when flushing, or when setting 'translog.disable_flush: false'
from each node across the cluster?

I'm not clear on whether it is possible to script this independently on
each node, coordinating somehow within the semantics of the elasticsearch
cluster (using a simple script such as this
https://gist.github.com/1074906), or should I start with a single system to
direct the backups cluster-wide?

Also, this particular cluster isn't on AWS, otherwise I would probably use
the S3 gateway.


(Karussell-2) #2

Just for the record, as this message is a bit old ...
I think you should definitely coordinate the timing of the flush
enable/disable calls when you have multiple servers, although in some cases
one copy is sufficient (e.g. 2 servers and replica=1 for all indices).

But the calls can be timed by wrapping them around the rsync run for every
server, as in the script below.

Hope this helps someone.

# the first rsync runs can take a while, so do not disable flushing yet

echo "(SERVER1) rsync from $FROM to $TO1"
ssh $SERVER1 "rsync -a $FROM $TO1"

echo "(SERVER2) rsync from $FROM to $TO2"
ssh $SERVER2 "rsync -a $FROM $TO2"

# now disable flushing and do one manual flush

$SCRIPTS/es-flush-disable.sh true

echo "rsync both again ..."
ssh $SERVER1 "rsync -a $FROM $TO1"
ssh $SERVER2 "rsync -a $FROM $TO2"

$SCRIPTS/es-flush-disable.sh false
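
The es-flush-disable.sh helper itself is not shown in this thread. A minimal sketch of what such a script might contain, assuming an Elasticsearch version that still supports the index.translog.disable_flush setting and a node reachable at ES_URL. The function names, the ES_URL variable, and the choice to flush just before disabling are assumptions for illustration, not the original script:

```shell
#!/bin/sh
# Hypothetical stand-in for es-flush-disable.sh (not the original helper).
# Assumption: the node answers at $ES_URL and supports the
# index.translog.disable_flush setting.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the JSON settings body for the requested state ("true" or "false").
flush_settings_body() {
    case "$1" in
        true|false)
            printf '{"index":{"translog.disable_flush":%s}}\n' "$1"
            ;;
        *)
            echo "usage: flush_settings_body true|false" >&2
            return 1
            ;;
    esac
}

# Disable or re-enable automatic flushing for all indices.
# When disabling, issue one manual flush first so the files on disk
# are as current as possible before the final rsync pass.
set_flush_disabled() {
    body=$(flush_settings_body "$1") || return 1
    if [ "$1" = "true" ]; then
        curl -fsS -XPOST "$ES_URL/_flush" >/dev/null || return 1
    fi
    curl -fsS -XPUT "$ES_URL/_settings" \
        -H 'Content-Type: application/json' \
        -d "$body" >/dev/null
}
```

With these functions sourced, calling `set_flush_disabled true` before the final rsync pass and `set_flush_disabled false` afterwards would mirror the two es-flush-disable.sh calls above. Since a PUT to /_settings without an index name applies to all indices, one call against any node should be enough, so the script only has to run from the machine that drives the rsync steps.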

