Great, that makes sense. Thank you all for your help -- much appreciated!
Best,
Alex
On Sat, Jul 2, 2011 at 4:32 PM, Shay Banon shay.banon@elasticsearch.com wrote:
Hi,
Let me explain. With the local gateway (the default one), snapshotting has
no meaning, since the data is persisted on each node and restored (on
cluster restart) from the different nodes' local data locations.
Snapshotting is there for the shared gateway option. It's the process of
taking the local data and persisting it (delta-wise) to the shared storage.
By default it runs on a schedule, but the gateway snapshot API is there
to invoke it explicitly.
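For illustration, an explicit snapshot call over HTTP might look like the sketch below. It only builds the request and does not send it. The `/_gateway/snapshot` path is recalled from the REST API of that era and should be checked against the documentation for your version; the host, port, index names, and the `gateway_snapshot_request` helper are assumptions made for the example.

```python
import urllib.request


def gateway_snapshot_request(indices, base_url="http://localhost:9200"):
    """Build (without sending) a POST that asks for an explicit gateway snapshot.

    `indices` is a list of index names; an empty list targets all indices.
    The endpoint path is an assumption to verify against your version's docs.
    """
    target = ",".join(indices)
    path = f"/{target}/_gateway/snapshot" if target else "/_gateway/snapshot"
    return urllib.request.Request(base_url + path, data=b"", method="POST")


if __name__ == "__main__":
    # Hypothetical index names, printed rather than sent.
    req = gateway_snapshot_request(["twitter", "logs"])
    print(req.get_method(), req.full_url)
```

Sending it would be a matter of passing the request to `urllib.request.urlopen`. This only makes sense with a shared gateway, as described above.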
As for what Karussell suggested: it's a means of backing up when you are
using the local gateway. You can disable translog flushing, make a
copy of each node's data location, and then re-enable flushing.
-shay.banon
On Saturday, July 2, 2011 at 11:27 PM, Karussell wrote:
On Jul 1, 11:04 pm, Clinton Gormley clin...@iannounce.co.uk wrote:
Hi Alex
elasticsearch/modules/elasticsearch/src/main/java/org/elasticsearch/
action/admin/indices/gateway/snapshot/GatewaySnapshotRequest.java
says:
Gateway snapshot allows to explicitly perform a snapshot through
the gateway of one or more indices (backup them).
By default, each index gateway periodically snapshot changes,
though it can be disabled and be controlled completely
through this API. Best created using {@link
org.elasticsearch.client.Requests#gatewaySnapshotRequest(String...)}.
I'd like to learn more about this, I wasn't able to find any code or
configuration knobs for the periodic snapshot feature. I understand
that this may not be documented yet; if it isn't, where should I look
for the code?
First of all, why do you not want to use the local (default) gateway? It
performs best.
Second, snapshotting in the shared gateway happens automatically; you
don't need to think about it.
If you want to back up a data dir, you should just be able to copy or
rsync it. To be sure that no segments have merged in the meantime, you
can just run a second rsync on it, which should be much faster.
There is also a setting that lets you disable flushes and re-enable them
when you are done:
index.translog.disable_flush
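A minimal sketch of that backup sequence, assuming the setting can be toggled through the index update-settings endpoint (`PUT /{index}/_settings`) and that `rsync` is on the PATH. The host, the request body shape, the paths, and all helper names here are assumptions for illustration, not something confirmed in this thread.

```python
import json
import subprocess
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a node listening locally


def flush_setting_request(index, disabled, base_url=ES_URL):
    """Build (without sending) a settings update for index.translog.disable_flush."""
    body = json.dumps({"index.translog.disable_flush": disabled}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def rsync_command(data_dir, backup_dir):
    """Return the rsync argv that copies the contents of data_dir into backup_dir."""
    return ["rsync", "-a", data_dir.rstrip("/") + "/", backup_dir]


def backup(index, data_dir, backup_dir):
    """Disable flushing, copy the data directory, then always re-enable flushing."""
    urllib.request.urlopen(flush_setting_request(index, True))
    try:
        subprocess.run(rsync_command(data_dir, backup_dir), check=True)
    finally:
        # Re-enable flushing even if the copy fails.
        urllib.request.urlopen(flush_setting_request(index, False))
```

The `try`/`finally` is there so that a failed copy does not leave flushing disabled. With the local gateway this would have to be run against each node's data location.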