We are planning to use ES as a NoSQL database solution. To do that, we need
to generate a daily backup and copy it to tape.
We tried to use the fs gateway, but we noticed that recovery was greatly
impacted when we did a full cluster restart, as it seems to recover the
cluster from the last backup. Using the local gateway, it performed well.
We thought of using more than one gateway at a time, but the system won't
start correctly. Doing the index backup with the rsync solution that was
presented on the list will probably take some time to set up on each client,
and there is a risk of leaving some nodes out of the backup. If this
happens, we can lose some shards.
Is there any solution that has been implemented to solve this kind of problem
in an ES environment? Can we use more than one gateway at a time? If so, how
can we use it as a backup solution?
The recommended way is to use the local gateway and rsync / copy the files over from the data location of each node (disable the translog flush before issuing the rsync command, and re-enable it afterwards). Hopefully, in the future, we will have a backup API with more options that is a bit more usable.
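The steps described in this reply can be sketched roughly as follows. This is only an illustration under assumptions, not an official procedure: it assumes the flush toggle is the index setting `index.translog.disable_flush` sent to the `_settings` endpoint (as in Elasticsearch versions of that era), that rsync can reach each node over SSH, and that every node uses the same data path. The host names, paths, and the helper names `set_flush_disabled`, `rsync_command`, and `backup` are all hypothetical.

```python
import json
import subprocess
import urllib.request


def set_flush_disabled(es_url, disabled):
    """Toggle translog flushing on all indices.

    Assumption: the cluster accepts the index setting
    'index.translog.disable_flush' on the _settings endpoint.
    """
    body = json.dumps({"index": {"translog.disable_flush": disabled}}).encode("utf-8")
    request = urllib.request.Request(
        es_url.rstrip("/") + "/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def rsync_command(host, data_dir, dest_root):
    """Build the rsync call that copies one node's data directory.

    Each node gets its own sub-directory under dest_root, so no node
    overwrites another. dest_root itself must already exist.
    """
    return ["rsync", "-a", f"{host}:{data_dir}/", f"{dest_root}/{host}/"]


def backup(es_url, hosts, data_dir, dest_root,
           run=subprocess.check_call, set_flush=set_flush_disabled):
    """Disable flush, copy the data directory of EVERY node, re-enable flush."""
    set_flush(es_url, True)
    try:
        for host in hosts:
            run(rsync_command(host, data_dir, dest_root))
    finally:
        # Re-enable flushing even if one of the copies failed.
        set_flush(es_url, False)


# Example call (hypothetical hosts and paths):
# backup("http://localhost:9200", ["es-node1", "es-node2", "es-node3"],
#        "/var/lib/elasticsearch/data", "/backup/es")
```

Keeping the node list in one place and looping over it is one way to reduce the risk, raised in the original question, of leaving a node out of the backup.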
On Wednesday, March 14, 2012 at 5:44 AM, Luiz Guilherme Pais dos Santos wrote:
[original question, quoted in full above]
In a cluster where a node may not have all the shards, what's the
easiest/recommended way to get a node to have all the shards to do a full
backup?
On Wednesday, 14 March 2012 23:31:59 UTC+11, kimchy wrote:
[reply and original question, quoted in full above]
We are using 5 clusters, each with 3 nodes.
Do I need to rsync only the "data" directory? Is it enough to take the
backup from any one node in each cluster? Also, can I recover the cluster
from that data directory if some data is lost in a crash?
In the script above, flush is disabled to stop data/records from being
indexed to that node (am I right?). What happens to any data that is
indexed during that time?