Elasticsearch Automatic Snapshots - Curator?

Nikesh · January 23, 2019, 10:41am

Hi,

I'm looking to snapshot indexes on my local Windows system (single node).
I can create snapshots of specific indexes and restore them after deleting the indexes.
However, I do this manually using the Snapshot and Restore APIs, and I would like to back up my indexes every day.
I understand Curator was developed for this. I installed Curator using the Windows MSI installer, but I'm not sure where to run the command-line interface or where to find my configuration file and action file.

Is Curator the right tool for automatic snapshots? If yes, how do I configure it and run it successfully? If not, what should I use instead?

theuntergeek · January 23, 2019, 12:47pm

Look in "C:\Program Files\elasticsearch-curator"

You will need to use a scheduler to call the curator binary periodically. For example, every day at 3am, you might have the scheduler run your snapshot by launching "C:\Program Files\elasticsearch-curator\curator.exe" --config C:\config.yml C:\action.yml (quote the executable path because it contains a space, and configure config.yml and action.yml accordingly).
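If the scheduler should launch a script rather than the binary directly, a thin wrapper along these lines could work. This is only a sketch: the three paths are the ones from the example command above, and the function names `build_curator_command` and `run_snapshot` are made up for illustration.

```python
import subprocess

# Paths taken from the example command above; adjust them to your installation.
CURATOR_EXE = r"C:\Program Files\elasticsearch-curator\curator.exe"
CONFIG_FILE = r"C:\config.yml"
ACTION_FILE = r"C:\action.yml"


def build_curator_command(exe=CURATOR_EXE, config=CONFIG_FILE, action=ACTION_FILE):
    """Return the argument list for: curator.exe --config <config> <action>."""
    return [exe, "--config", config, action]


def run_snapshot():
    """Launch Curator once and return the exit code of the process."""
    # Passing a list avoids quoting problems with the space in "Program Files".
    completed = subprocess.run(build_curator_command())
    return completed.returncode
```

The scheduler would then call `run_snapshot()` once a day; by the usual convention, a nonzero exit code means the run failed and is worth logging.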

Nikesh · January 23, 2019, 1:29pm

Thanks for the response. This helps.

I have a follow-up to my previous question.
Suppose I use a scheduler to create a snapshot of a single index every day at 3am for 8 consecutive days:
Day 1 would be the first backup of the index.
Day 2 to Day 7 would be incremental backups.
On Day 8, I would like to merge the Day 1 and Day 2 snapshots into a single snapshot (let's say with the same name as the snapshot created on Day 2), delete the Day 1 snapshot, and hold a maximum of 7 snapshots at any one time. Is this possible?
Can I achieve this by just deleting the Day 1 snapshot and expecting the Day 2 snapshot to hold the segments of the Day 1 snapshot, since they are of the same index?

theuntergeek · January 23, 2019, 2:10pm

You misunderstand the nature of snapshots.

Yes, snapshots are incremental, but they are incremental at the Elasticsearch segment level, not at the data level.

If I write 10 documents to a new Elasticsearch index, they are flushed as a single segment, and I then take a snapshot, I will have written that single segment to the snapshot repository.

Now imagine that we do this 100 times. It would be easy to assume that each snapshot will only contain the new segment. But behind the scenes, Elasticsearch will have merged several of the segments during the course of the new indexing operations. Instead of 100 segments, I will likely have 20, and some of the segments will have 20, 40, 50, or even up to 100 documents. Any new segment (whether created by newly indexed documents or by merging existing segments) will be treated as a new segment by the snapshot API. As such, while the snapshots will be incremental, any segment that does not exist in the snapshot repository will be copied over, even if it contains documents that already exist in a different segment which was previously copied to the repository.

A snapshot contains a list of pointers to the segments which were present in the index or indices snapshotted. If a segment was already present in the repository, the snapshot will point to the pre-existing segment, so that segment will not be re-copied from the cluster. And, when it comes time to delete snapshots, if a newer snapshot has a reference pointer to a segment in a snapshot scheduled for deletion, that segment will not be deleted, as it is still required. No segment will be deleted from the repository as long as at least one snapshot still references it.

There is no concept of "merging" snapshots. A snapshot points to segments. If a segment is not already in the repository, it will be copied across as part of the snapshot process.
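To make the pointer behavior concrete, here is a toy model in Python. It is not Elasticsearch code and does not reflect the real repository layout; `ToyRepository` and the segment names are invented purely to illustrate the reference rules described above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepository:
    """Toy model of a snapshot repository (illustration only)."""
    segments: set = field(default_factory=set)     # segment files stored in the repository
    snapshots: dict = field(default_factory=dict)  # snapshot name -> segments it points to

    def snapshot(self, name, index_segments):
        """Record a snapshot and return the segments that had to be copied."""
        wanted = set(index_segments)
        copied = wanted - self.segments  # only segments not yet in the repository
        self.segments |= copied
        self.snapshots[name] = wanted
        return copied

    def delete(self, name):
        """Delete a snapshot and return the segments removed from the repository."""
        doomed = self.snapshots.pop(name)
        still_referenced = set().union(*self.snapshots.values())
        removed = doomed - still_referenced  # keep anything another snapshot points to
        self.segments -= removed
        return removed


repo = ToyRepository()
copied_day1 = repo.snapshot("day-1", {"seg_a", "seg_b"})
copied_day2 = repo.snapshot("day-2", {"seg_a", "seg_b", "seg_c"})
# Before day 3, a merge replaced seg_a and seg_b with a new segment, seg_ab.
copied_day3 = repo.snapshot("day-3", {"seg_ab", "seg_c"})

removed_with_day1 = repo.delete("day-1")
day2_still_complete = repo.snapshots["day-2"] <= repo.segments
removed_with_day2 = repo.delete("day-2")
```

In this model, day 2 copies only `seg_c`, while day 3 copies `seg_ab` in full even though its documents were already in the repository. Deleting `day-1` removes nothing, because `day-2` still points to `seg_a` and `seg_b`, so `day-2` stays fully restorable without any merge step. Those two segments disappear only once `day-2` is deleted as well.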

Hopefully this helps.

system · February 20, 2019, 2:17pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
