Snapshot strategy

pri · September 3, 2018, 11:59pm

I have been reading about snapshot and restore functionality in Elasticsearch. Coming from a relational database background, it took some effort to understand this correctly. Based on my understanding, I am trying to finalize the snapshot strategy for our production Elasticsearch cluster. We do not have time series indices currently (though that might change in the near future).
I will be using Curator to create new snapshots and delete older ones. The snapshots will be stored on Amazon S3.
I plan to have 3 repos: hourly, daily and weekly, with separate Curator actions that back up to each of them. Cron jobs will run the Curator actions as follows: an hourly snapshot every hour, a daily snapshot once a day, and a weekly snapshot once a week.
As for deleting older snapshots, Curator delete actions will remove snapshots older than 24 hours from the hourly repo, older than 7 days from the daily repo, and older than 4 weeks from the weekly repo.
I have the following questions:

1. Will this backup strategy cause any performance issues when creating/deleting snapshots?
2. Having 3 repos vs. having a single repo and using different name patterns: which is advisable?
3. For an overall Elasticsearch backup strategy, is there any "good practices" documentation that I can refer to?
4. Does this backup strategy fall in the "optimal" category, or is it overkill?

Thanks!

theuntergeek · September 4, 2018, 12:32am

On performance: it could, depending on how much activity the cluster sees on an ongoing basis and how long each snapshot takes to complete. Another potential problem is "collisions": if you try to take a snapshot while another (hourly, daily, or weekly) is already underway, it will fail, because you cannot take multiple snapshots concurrently. It will not simply wait until the current snapshot completes and then proceed, either.
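One way to sidestep such collisions is to have each cron job check for a running snapshot first and skip or retry later if it finds one. Below is a minimal sketch, assuming the caller passes in the parsed JSON body of a `GET /_snapshot/_status` request and that running snapshots appear under a `snapshots` key with a `state` field. The function name and the set of states treated as "running" are illustrative assumptions, not part of Curator.

```python
from typing import Any, Dict

# States treated as "still running" (an assumption for this sketch).
RUNNING_STATES = {"INIT", "STARTED"}


def snapshot_in_progress(status_response: Dict[str, Any]) -> bool:
    """Return True if the status response lists any snapshot that is still running."""
    snapshots = status_response.get("snapshots", [])
    return any(s.get("state") in RUNNING_STATES for s in snapshots)


if __name__ == "__main__":
    # Hypothetical responses, shaped like the status API output assumed above.
    idle = {"snapshots": []}
    busy = {"snapshots": [{"snapshot": "hourly-2018090400", "state": "STARTED"}]}
    print(snapshot_in_progress(idle))
    print(snapshot_in_progress(busy))
```

A wrapper script could call this before invoking Curator and exit early when it returns `True`, so that an hourly run does not fail just because the daily run is still underway.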

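A single repo with different name patterns would permit re-use of existing segments across the hourly, daily, and weekly snapshots. With separate repos, each one has to store its own copy of those segments, so you could end up paying for extra data storage.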
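With a single repo, the retention rules from the question can be keyed on the snapshot name prefix instead of on the repository. The sketch below shows only the selection logic, assuming snapshots are named `hourly-…`, `daily-…`, and `weekly-…` (hypothetical prefixes) and that their creation times are already known. In Curator this would roughly correspond to combining a name-pattern filter with an age filter.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Retention per name prefix, taken from the plan in the question.
# The prefixes themselves are an assumption for this sketch.
RETENTION = {
    "hourly-": timedelta(hours=24),
    "daily-": timedelta(days=7),
    "weekly-": timedelta(weeks=4),
}


def snapshots_to_delete(snapshots: Dict[str, datetime], now: datetime) -> List[str]:
    """Return the names of snapshots older than the retention for their prefix."""
    expired = []
    for name, created in snapshots.items():
        for prefix, keep_for in RETENTION.items():
            if name.startswith(prefix) and now - created > keep_for:
                expired.append(name)
                break
    return sorted(expired)


if __name__ == "__main__":
    now = datetime(2018, 9, 4, 12, 0)
    existing = {
        "hourly-2018090411": datetime(2018, 9, 4, 11, 0),
        "hourly-2018090310": datetime(2018, 9, 3, 10, 0),
        "daily-20180901": datetime(2018, 9, 1),
        "daily-20180820": datetime(2018, 8, 20),
        "weekly-20180701": datetime(2018, 7, 1),
    }
    print(snapshots_to_delete(existing, now))
```

Snapshots whose names match none of the prefixes are left alone, which keeps any manually taken snapshots out of the automated cleanup.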

On "good practices" documentation: not especially, unfortunately. Backup strategies are mainly determined by comfort level and the desire for redundancy. Your current approach appears to be well thought out.

Considering that snapshots work at the segment level, and not the data level, this approach may be a bit of overkill. But it's all about what your requirements are, not someone else's arbitrary idea of what you should need.
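To make the segment-level point concrete, here is a toy model of incremental snapshots. It is a simplification for illustration, not a description of Elasticsearch internals: a repo is modeled as a set of segment file names, and a snapshot copies only the segments the repo does not already hold. All names in it are hypothetical.

```python
from typing import Set


def take_snapshot(repo: Set[str], live_segments: Set[str]) -> int:
    """Copy segments missing from the repo into it; return how many were copied."""
    missing = live_segments - repo
    repo |= missing  # in-place update, so the caller's repo set grows
    return len(missing)


if __name__ == "__main__":
    segments_at_1am = {"seg_a", "seg_b", "seg_c"}
    segments_at_2am = {"seg_a", "seg_b", "seg_c", "seg_d"}  # one new segment

    # Single shared repo: the second snapshot copies only the new segment.
    shared: Set[str] = set()
    print(take_snapshot(shared, segments_at_1am))
    print(take_snapshot(shared, segments_at_2am))

    # Separate repos: each repo needs its own full copy of every segment.
    hourly: Set[str] = set()
    daily: Set[str] = set()
    print(take_snapshot(hourly, segments_at_1am))
    print(take_snapshot(daily, segments_at_2am))
```

In this model, frequent snapshots into one repo cost little when few segments have changed, while splitting the same snapshots across repos duplicates the unchanged segments in each.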

pri · September 4, 2018, 1:07am

Thank you for the quick and clear reply. It answered all my questions.

