Daily cluster backup

We are planning to use ES as a NoSQL database solution. To do that, we need
to generate a daily backup and copy it to tape.
We tried the fs gateway, but we noticed that recovery was heavily impacted
when we did a full cluster restart, as it seems to recover the cluster from
the last backup. With the local gateway, it performed well.

We thought of using more than one gateway at a time, but the system won't
start correctly. If we back up the indices with the rsync solution that was
presented on the list, it will probably take some time to set up each client,
and there is a risk of leaving some nodes out of the backup. If that
happens, we can lose some shards.

Has any solution been implemented to solve this kind of problem in an ES
environment? Can we use more than one gateway at a time? If so, how can we
use that as a backup solution?

--
Luiz Guilherme P. Santos

The recommended way is to use the local gateway and rsync / copy over the files from the data location of each node (disable the translog flush before issuing the rsync command). Hopefully, in the future, we will have a backup API that offers more options and is a bit more usable.
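
As a rough illustration of the procedure described above (disable the translog flush, rsync each node's data location, then turn flushing back on), here is a minimal Python sketch. It only builds the list of steps and does not run anything. The setting name `translog.disable_flush` under `index`, the `/_settings` endpoint, the host names, and the paths are all assumptions made for the example and should be checked against the ES version in use.

```python
import json


def flush_settings_request(es_url, disabled):
    """Return (method, url, body) for toggling translog flush on all indices.

    The setting name below is an assumption taken from backup scripts of
    that era; verify it for your ES version.
    """
    body = json.dumps({"index": {"translog.disable_flush": disabled}})
    return ("PUT", es_url.rstrip("/") + "/_settings", body)


def rsync_command(host, data_dir, dest_root):
    """Build the rsync argv that copies one node's data directory."""
    source = "%s:%s/" % (host, data_dir.rstrip("/"))
    target = "%s/%s/" % (dest_root.rstrip("/"), host)
    return ["rsync", "-a", source, target]


def backup_plan(es_url, hosts, data_dir, dest_root):
    """Ordered steps: disable flush, rsync every node, re-enable flush."""
    steps = [flush_settings_request(es_url, True)]
    steps += [rsync_command(h, data_dir, dest_root) for h in hosts]
    steps.append(flush_settings_request(es_url, False))
    return steps


if __name__ == "__main__":
    # Hypothetical hosts and paths, for illustration only.
    for step in backup_plan("http://localhost:9200", ["node1", "node2"],
                            "/var/lib/elasticsearch/data", "/backup"):
        print(step)
```

Whatever actually executes these steps should re-enable flushing even if an rsync fails (for example in a `finally` block), so the translog does not keep growing.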

(In reply to the message from Luiz Guilherme Pais dos Santos, Wednesday, March 14, 2012 at 5:44 AM.)

In a cluster where a node may not have all the shards, what's the
easiest/recommended way to get a node to have all the shards to do a full
backup?
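
Related to this, and to the earlier worry about leaving nodes out: one way to reason about it is to check whether the set of nodes being backed up holds at least one copy of every shard. The sketch below assumes you already have a mapping from each shard to the nodes holding a copy of it; how that mapping is obtained is left out, and the index and node names are hypothetical.

```python
def uncovered_shards(shard_locations, backed_up_nodes):
    """Return the shards that have no copy on any backed-up node.

    shard_locations maps (index_name, shard_number) to the list of node
    names holding a copy (primary or replica) of that shard.
    """
    backed_up = set(backed_up_nodes)
    return sorted(
        shard
        for shard, nodes in shard_locations.items()
        if not backed_up.intersection(nodes)
    )


if __name__ == "__main__":
    # Hypothetical layout: 3 shards, each with copies on 2 of 3 nodes.
    locations = {
        ("logs", 0): ["node1", "node2"],
        ("logs", 1): ["node2", "node3"],
        ("logs", 2): ["node3", "node1"],
    }
    print(uncovered_shards(locations, ["node1"]))
    print(uncovered_shards(locations, ["node1", "node2"]))
```

An empty result means the chosen nodes cover every shard. A non-empty result lists the shards that would be missing from the backup.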

(In reply to kimchy's message of Wednesday, 14 March 2012 23:31:59 UTC+11.)

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hi Kimchy,

I am trying to follow this backup script

I have a few questions about it:

  1. We are running 5 clusters, each with 3 nodes.
  2. Do I need to rsync only the "data" directory? Is it enough to take
    the backup from any one node in each cluster? Also, can I recover
    the cluster from that data directory if a crash causes data loss?
  3. The script above disables flush to stop data/records from being
    indexed on that node (am I right?). What happens to any data that is
    indexed during that time?

(In reply to kimchy's message of Wednesday, March 14, 2012 6:01:59 PM UTC+5:30.)
