Elasticsearch Backup and Recovery


(ElasticFan) #1

Hi all,

We recently set up a big cluster where every day we index around 50 million
records, cumulatively sized over 40 GB. We use 3 big machines with 128 GB
RAM, dual hex-core processors and 5 TB of disk (only 7200 rpm, but RAID 5).
We create one index per day, as per the recommendations given in this
discussion group.

We kept the settings as shard=2 and replica=2. This is where the question
begins. We would like to have an incremental backup process so that if
there is any data corruption, an accidental delete, or even if the whole
cluster goes down, we can recover the data in full. Initially my plan was
to do the following.

  1. Do FS Gateway to a remote system. Say a separate machine which will
    have 5 TB storage with RAID 5 and 15,000 rpm. (Is this doable?)
  2. Do regular tape backup on this backup machine.
  3. Restore the data to the Gateway

By doing this, if we find a problem, we will be able to roll back to a
certain point in time. My assumption here is that the FS Gateway will
persist both the state and the data of the cluster. Is this assumption
correct, and is this approach recommended? If not, what are the other ways
to do a full data recovery? I have replica:2, so the data will be present
on 2 nodes at all times, and so I cannot go with a single-machine data
backup. Please help me with this.

Regards,
KS


(Berkay Mollamustafaoglu-2) #2

A quick correction. If you have set replica to 2, you have 2 replica copies
plus the primary, hence a total of 3 copies. So with 3 nodes you have the
same data on all nodes. If you want just one extra copy, set replica to 1.
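
To make the copy count concrete, here is a minimal sketch of the arithmetic. The function name and the returned keys are illustrative only and are not part of Elasticsearch:

```python
def shard_copies(primary_shards: int, replicas: int, days: int = 1) -> dict:
    """Count shard copies for an index-per-day layout.

    `replicas` is the number of extra copies per primary shard, so every
    document exists (1 + replicas) times in the cluster.
    """
    copies_per_shard = 1 + replicas  # the primary plus each replica
    shards_per_index = primary_shards * copies_per_shard
    return {
        "copies_of_each_document": copies_per_shard,
        "shards_per_index": shards_per_index,
        "shards_total": shards_per_index * days,
    }


# shard=2, replica=2 as in the original post: 3 copies of every document.
print(shard_copies(primary_shards=2, replicas=2))
# shard=2, replica=1: 2 copies of every document.
print(shard_copies(primary_shards=2, replicas=1))
```

With 3 nodes and replica=2, each node ends up holding one copy of every shard, which is why all nodes carry the same data.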

Berkay

--
Regards,
Berkay Mollamustafaoglu
Ph: +1 (571) 766-6292
mberkay on yahoo, google and skype


(Paul Smith) #3

We use the FS Gateway pointed at an NFS mount on a different machine. Every
night we suspend the gateway snapshot via the REST API, make a temporary
hard-link copy of the gateway directory on that machine, and then resume
snapshotting. We then rsync this hard-link copy of the gateway to a DR
location on the other side of the planet. So we now have a daily snapshot
ready to load in a different data center.
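
The nightly hard-link step could be sketched roughly as follows. This is an illustration under assumptions, not the script described above: `suspend` and `resume` are hypothetical callbacks standing in for whatever REST calls pause and resume gateway snapshotting, and both directories must live on the same filesystem for hard links to work.

```python
import os
import shutil
from typing import Callable


def hardlink_snapshot(gateway_dir: str, snapshot_dir: str,
                      suspend: Callable[[], None],
                      resume: Callable[[], None]) -> None:
    """Make a frozen hard-link copy of gateway_dir at snapshot_dir.

    suspend/resume are hypothetical hooks supplied by the caller; this
    sketch does not know how snapshotting is paused on a given cluster.
    """
    suspend()
    try:
        # copytree recreates the directory tree; os.link creates a hard
        # link for every file instead of copying its bytes, so the copy
        # is fast and uses almost no extra disk space.
        shutil.copytree(gateway_dir, snapshot_dir, copy_function=os.link)
    finally:
        # Always resume snapshotting, even if the copy failed.
        resume()
```

The hard-link copy stays stable while the gateway keeps changing, so it can then be shipped to the DR site (for example with rsync) at leisure.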

We then use Scrutineer (we wrote this):
https://github.com/Aconex/scrutineer to roll the index forward/backward so
that its state is in sync with the copy of the database, which is
replicated to this same DR data center using txn log shipping.

You can use Scrutineer on your live system to check for integrity errors
too; it helps with detecting data mismatches, missing items, etc. Beats a
full reindex, that's for sure.
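
The kind of integrity check meant here can be illustrated with a small sketch. It is not Scrutineer's code or API; it only assumes that both the database and the index can be reduced to a mapping of document id to version, and the names below are made up for the example:

```python
def diff_versions(source_of_truth: dict, index: dict) -> dict:
    """Compare id -> version maps from the database and the search index."""
    missing = sorted(k for k in source_of_truth if k not in index)
    mismatched = sorted(
        k for k in source_of_truth
        if k in index and index[k] != source_of_truth[k]
    )
    unexpected = sorted(k for k in index if k not in source_of_truth)
    return {
        "missing_from_index": missing,     # in the database, not indexed
        "version_mismatch": mismatched,    # indexed, but at another version
        "not_in_database": unexpected,     # indexed, but gone from the database
    }


database = {"a": 1, "b": 2, "c": 3}
indexed = {"a": 1, "b": 1, "d": 4}
print(diff_versions(database, indexed))
```

Only the documents reported in the three lists need to be reindexed or deleted, which is far cheaper than rebuilding the whole index.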

cheers,

Paul


(ElasticFan) #4

Hi Berkay,

You are correct. The replica count was actually set to 1: a primary and one
replica, on 3 nodes in total.

Hi Paul,

Thank you so much. I was doing rsync on the ES data directory as a
temporary backup solution. Could you please guide me in setting up the FS
Gateway correctly? I tried something, but I could not restore the data.

  1. Could you please show me which settings you made, and in which
    config files?
  2. Is incremental index possible on the FS Gateway snapshot?
  3. Am I correct in understanding that the FS Gateway snapshot will
    contain both the state and the data?
  4. How do we restore the data back into the cluster? Is it a matter of
    restoring the snapshot and then restarting the cluster?
  5. Are you using it in production? If yes, do you know how long it would
    take to restore a cluster with 900 GB of data in total?

Sorry for firing away all these questions. Any help would be much
appreciated.

Thank you. This thread would help a lot of poor souls trying to find an
optimum disaster recovery solution.

Regards,
KS


(Paul Smith) #5

> Hi Paul,
>
> Thank you so much. I was doing rsync on the ES data directory as a
> temporary backup solution. Could you please guide me in setting up the FS
> Gateway correctly? I tried something, but I could not restore the data.
>
>   1. Could you please show me which settings you made, and in which
>     config files?

The gateway setting is very simple:

    # Gateway is shared FS
    gateway:
      type: fs
      fs:
        location: /mnt/esgateway/

/mnt/esgateway is an NFS share exported from a physical host that is
separate from all the ES nodes. Every ES node mounts this same share.
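
Since every node depends on that mount being present, a quick sanity check before starting a node may be worthwhile. The following is only a sketch; the function name is hypothetical and the checks are assumptions about what is worth verifying, not something Elasticsearch does for you:

```python
import os


def gateway_location_problems(location: str) -> list:
    """Return a list of problems found with the gateway location."""
    problems = []
    if not os.path.isdir(location):
        problems.append("not a directory")
        return problems
    # If the NFS share failed to mount, the path is just an empty local
    # directory. Note: this check only makes sense when the configured
    # location is the mount point itself, not a subdirectory of it.
    if not os.path.ismount(location):
        problems.append("not a mount point")
    if not os.access(location, os.R_OK | os.W_OK):
        problems.append("not readable and writable")
    return problems


print(gateway_location_problems("/mnt/esgateway/"))
```

An empty list means the location looks usable from this node; the same check would have to be run on every node.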

>   2. Is incremental index possible on the FS Gateway snapshot?

No, with a bit of yes, but it's not what you think it is. At the end of
the day the segments are files, and the incremental 'sync' difference is
relatively low. The smaller segments generally merge together, more often
than not leaving the bigger segments unchanged until a larger merge
happens, so rsync tends to give a good 'saving' by not needing to transfer
too many large files. That holds until a very large merge or an Optimize is
done, and then it's more or less all brand new files again.

So it's mostly no: it's always a 'full' sync, but there are lots of savings
there. Maybe you're asking a different question though.
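
The saving can be estimated with a small sketch that mimics rsync's default quick check (same name, size and modification time means the file is skipped). The file names and sizes below are made up for illustration:

```python
def bytes_to_transfer(previous: dict, current: dict) -> int:
    """Estimate how many bytes a sync has to send.

    Both arguments map a file name to a (size_in_bytes, mtime) tuple.
    A file is skipped when name, size and mtime are all unchanged.
    """
    return sum(
        size
        for name, (size, mtime) in current.items()
        if previous.get(name) != (size, mtime)
    )


GB = 10**9
MB = 10**6
# Hypothetical listings of one shard on two consecutive nights.
last_night = {"_a.cfs": (5 * GB, 100), "_b.cfs": (40 * MB, 200),
              "_c.cfs": (30 * MB, 210)}
# The two small segments merged into _d.cfs; the big segment is untouched.
tonight = {"_a.cfs": (5 * GB, 100), "_d.cfs": (70 * MB, 300)}

print(bytes_to_transfer(last_night, tonight) // MB, "MB to send")
# After a full optimize everything is one brand new file:
print(bytes_to_transfer(tonight, {"_e.cfs": (5 * GB + 70 * MB, 400)}) // MB,
      "MB to send")
```

In the first case only the newly merged small segment is sent; after the optimize the whole shard goes over the wire again.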

>   3. Am I correct in understanding that the FS Gateway snapshot will
>     contain both the state and the data?

Yes, the gateway includes the cluster state (metadata) and the indices
directories.

>   4. How do we restore the data back into the cluster? Is it a matter of
>     restoring the snapshot and then restarting the cluster?

While there's no 'restore' tool (as yet), all we do in our DR is:

  • Shut down the DR cluster
  • Wipe clean the local data directory on each node
  • Ensure the DR cluster is configured with the FS Gateway pointed at
    an NFS share holding the replicated copy of the gateway
  • Start up the cluster

All the nodes now recover their state from the gateway. (We have multiple
DCs using this DR data centre as a location, so we deliberately purge any
local node state to ensure we get a clean recovery for the DC coming into
the DR location.)
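
The order of the steps above can be sketched as follows. This is an illustration only: `stop_cluster` and `start_cluster` are hypothetical callbacks for however the nodes are stopped and started, the data directory paths are supplied by the caller, and the gateway configuration is assumed to be in place already.

```python
import os
import shutil
from typing import Callable, Iterable


def wipe_directory_contents(data_dir: str) -> None:
    """Delete everything inside data_dir but keep the directory itself."""
    for name in os.listdir(data_dir):
        path = os.path.join(data_dir, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)


def dr_restore(data_dirs: Iterable[str],
               stop_cluster: Callable[[], None],
               start_cluster: Callable[[], None]) -> None:
    """Stop the nodes, purge local node state, then start the nodes again."""
    stop_cluster()                      # never wipe under a running node
    for data_dir in data_dirs:
        wipe_directory_contents(data_dir)
    start_cluster()                     # nodes now recover from the gateway
```

Wiping is destructive, so a real script would want a confirmation step and a check that it is running against the DR cluster and not the live one.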

>   5. Are you using it in production? If yes, do you know how long it would
>     take to restore a cluster with 900 GB of data in total?

Yes we do. 900 GB is a good size for sure. Ours are only in the
up-to-100-GB range. You'll have to do some of your own testing on that;
how long it takes will be hardware/environment specific (disk RAID
setup, network bandwidth, number of nodes for parallel recovery, etc.). At
the end of the day it's how fast you can transfer the shard contents to the
relevant nodes.

I'm guessing here actually (Shay or others could confirm?), but I believe
the master 'delegates' to each node the recovery of specific shards from the
shared gateway. The central location will therefore be hit by all nodes
recovering from it, so that host is probably the limiting resource (disk &
network bandwidth on that node).
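
If the gateway host really is the bottleneck, a lower bound on the restore time follows from simple arithmetic. The 1 Gbit/s link speed below is an assumption for the example, not a figure from this thread, and real restores will be slower because of disk and NFS overhead:

```python
def min_restore_seconds(total_bytes: float,
                        bottleneck_bits_per_second: float) -> float:
    """Best-case time to pull total_bytes through the slowest link."""
    return total_bytes * 8 / bottleneck_bits_per_second


GB = 10**9
GBIT = 10**9

# 900 GB through an assumed 1 Gbit/s NIC on the gateway host.
seconds = min_restore_seconds(900 * GB, 1 * GBIT)
print(f"at least {seconds / 3600:.1f} hours")  # at least 2.0 hours
```

Adding more ES nodes does not improve this number as long as they all read from the same gateway host.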

Paul Smith

