Backup & retention strategy

eirc · September 16, 2016, 10:04am

Hello,

We are new to the Elastic Stack and we're trying to design our backup & data retention strategy. Our use case is logs, so we have the standard index-per-day setup.

Before delving into Elasticsearch internals, our business requirements are: recent logs must be fast to search; old logs, up to some point, must be backed up somewhere; and everything must be backed up and possible to restore in case of a fully catastrophic cluster failure.

After reading the relevant documentation we have come up with the following plan on how to leverage Elasticsearch features to cover all our requirements:

There are 4 stages our logs enter depending on their age:

- Warm
  - fast searchable
  - high CPU cluster nodes
- Cold
  - slow(er) searchable
  - low CPU cluster nodes
  - data compressed
  - data forcemerged
- Warm backup
  - fast restore
  - indexes closed but still on disk
- Cold backup
  - slow(er) restore
  - indexes deleted from disk
  - only available on snapshots

To ensure we have all the data backed up but our long term backups take up as little space as possible we're gonna use two snapshot repositories:

- Long term
  - Snapshot indices daily, after they have been moved to the Cold stage, so that they have already been compressed and optimised
- Short term
  - Snapshot indices in the Warm stage as often as possible
  - Keep snapshots at increasing intervals as they age and delete the rest (e.g. if the snapshot runs every minute, keep all minute snapshots of the last 20 minutes and all hourly snapshots for the last 24 hours)
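
The back-off retention rule from the example above could be sketched roughly as follows. This is only an illustration of the selection logic, not Curator or Elasticsearch API code. The function name `snapshots_to_keep` and its parameters are hypothetical, and it assumes "hourly snapshots" means keeping the newest snapshot in each clock hour.

```python
from datetime import datetime, timedelta


def snapshots_to_keep(timestamps, now,
                      minute_window=timedelta(minutes=20),
                      hourly_window=timedelta(hours=24)):
    """Return the set of snapshot timestamps to retain (hypothetical helper).

    Assumed rule: keep every snapshot younger than `minute_window`, and
    for anything older but still within `hourly_window`, keep only the
    newest snapshot of each clock hour. Everything else can be deleted.
    """
    keep = set()
    seen_hours = set()
    # Walk newest first so the first snapshot seen in an hour is the newest.
    for ts in sorted(timestamps, reverse=True):
        age = now - ts
        if age <= minute_window:
            keep.add(ts)
        elif age <= hourly_window:
            hour = ts.replace(minute=0, second=0, microsecond=0)
            if hour not in seen_hours:
                seen_hours.add(hour)
                keep.add(ts)
    return keep


if __name__ == "__main__":
    now = datetime(2016, 9, 16, 12, 0)
    stamps = [now - timedelta(minutes=i) for i in range(180)]
    kept = snapshots_to_keep(stamps, now)
    print(len(stamps) - len(kept), "snapshots could be deleted")
```

Snapshots not in the returned set are the candidates for deletion from the short-term repository.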

So to implement this strategy the following steps are run daily:

Delete indices older than Warm backup max age
Close indices older than Cold max age
Reallocate to cold nodes indices older than Warm max age
Forcemerge indices older than Warm max age
Snapshot indices older than Warm max age
Delete snapshots older than Cold backup max age
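
As a rough sketch, the per-index part of these daily steps can be expressed as a planning function that maps index age to actions. Everything here is an assumption for illustration: the `logstash-YYYY.MM.DD` naming, the threshold values, the action labels and the function name `plan_daily_actions`. It only builds a plan and does not call Elasticsearch; snapshot deletion (the last step) works on snapshots rather than indices and is left out.

```python
from datetime import date, datetime


def plan_daily_actions(index_names, today, warm_max_age, cold_max_age,
                       warm_backup_max_age, prefix="logstash-"):
    """Return a list of (action, index) tuples for the daily run.

    Ages are in days. Each index gets only the actions for the age band
    it currently falls into (an assumption made to keep the plan small).
    """
    plan = []
    for name in sorted(index_names):
        if not name.startswith(prefix):
            continue  # skip indices that are not daily log indices
        day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        age = (today - day).days
        if age > warm_backup_max_age:
            # Cold backup stage: index lives only in snapshots.
            plan.append(("delete_index", name))
        elif age > cold_max_age:
            # Warm backup stage: closed but still on disk.
            plan.append(("close", name))
        elif age > warm_max_age:
            # Cold stage: move, merge, then snapshot to the long-term repo.
            plan.append(("allocate_cold", name))
            plan.append(("forcemerge", name))
            plan.append(("snapshot_long_term", name))
    return plan


if __name__ == "__main__":
    names = ["logstash-2016.09.15", "logstash-2016.09.01",
             "logstash-2016.08.01", "logstash-2016.06.01", ".kibana"]
    for action, index in plan_daily_actions(names, date(2016, 9, 16), 7, 30, 90):
        print(action, index)
```

Forcemerging before the long-term snapshot matches the intent stated earlier in the post: the long-term repository should only receive indices that are already compressed and merged.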

And the following steps are run as often as possible:

Snapshot warm indices
Delete unneeded snapshots as explained above in Short term

For the implementation we plan on using Curator, since it seems to be the best way of keeping this kind of configuration in a human-readable format.

So... does this whole approach make sense? Have we missed any steps in the implementation? Is there some better or simpler way to do this?

Note that we have been largely inspired by this GitHub comment; it was a great read.

JoarSvensson · September 16, 2016, 11:38am

This approach makes sense to me. It's similar to what I usually recommend, depending on the use case and requirements of course.
