In my bucket I set the TTL to 365 days, so old files are removed after one year.
The index snapshot process is incremental. When making a snapshot, Elasticsearch analyses the list of index files already stored in the repository and copies only files that were created or changed since the last snapshot.
My concern is whether I would still be able to restore the most recent snapshot (younger than 1 year) once the older files have been deleted.
Have you tested whether the mechanism still works if some 1-year-old files vanish?
Not sure I was clear.
It is not the Elasticsearch S3 plugin that removes the files.
The files have a Time-To-Live of 365 days and are removed by S3 itself (automatically, with no process involved), regardless of whether they belong to another snapshot. That is why I need to check whether a snapshot can restore 'its part' from its own data.
It is important to be able to restore 'today's snapshot' without 'yesterday's files'.
I've just looked around; changing the S3 TTL (lifecycle) rule to auto-remove only files matching a pattern (like indices/logstash-201*) should probably solve the problem.
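For reference, a prefix-scoped lifecycle rule could look roughly like this (the rule ID and prefix are made up for illustration; check the AWS documentation for the exact schema your API version expects):

```json
{
  "Rules": [
    {
      "ID": "expire-old-logstash-index-files",
      "Filter": { "Prefix": "indices/logstash-201" },
      "Status": "Enabled",
      "Expiration": { "Days": 365 }
    }
  ]
}
```

With a rule like this, S3 would expire only objects under the matching prefix, leaving the top-level index, metadata* and snapshot* files alone.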
I guess we do care about the index, metadata* and snapshot* files, but the files under indices/* that are not part of the snapshot we are restoring could be missing and the restore would still succeed.
I'm happy because my guess was correct. Please take a look:

1. Create a snapshot:
curl -XPUT "localhost:9200/_snapshot/my_fs_repository/2016.05.09_evening?wait_for_completion=true"

2. Remove an index (2016.05.01):
curl -XDELETE 'http://localhost:9200/logstash-2016.05.01*?pretty'

3. Take another snapshot:
curl -XPUT "localhost:9200/_snapshot/my_fs_repository/2016.05.09_evening2?wait_for_completion=true"

4. Pretend to remove the index from the filesystem repository (not via the API):
mv /mnt/nfs/indices/logstash-2016.05.01/ /mnt/nfs/indices/logstash-2016.05.01_pseudoremove

The index content:
ls /mnt/nfs/indices/logstash-2016.05.01_pseudoremove
0 1 2 3 4 snapshot-2016.05.08 snapshot-2016.05.08_clean snapshot-2016.05.08_clean2 snapshot-2016.05.08_clean3 snapshot-2016.05.09_evening

5. Close all indices (do I always have to close indices on a full snapshot restore?):
curl -XPOST "localhost:9200/*/_close"
{"acknowledged":true}

6. Restoring the first snapshot fails:
curl -XPOST "localhost:9200/_snapshot/my_fs_repository/2016.05.09_evening/_restore?pretty"
{
"error" : "SnapshotException[[my_fs_repository:2016.05.09_evening] failed to read metadata]; nested: FileNotFoundException[/mnt/nfs/indices/logstash-2016.05.01/snapshot-2016.05.09_evening (No such file or directory)]; ",
"status" : 500
}

7. But this one works:
curl -XPOST "localhost:9200/_snapshot/my_fs_repository/2016.05.09_evening2/_restore?pretty"
{
"accepted" : true
}
So with the right file-delete policy it works without an API delete call.
Probably some orphaned snapshot metadata will be left behind, but that is something I can accept.
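The experiment above boils down to a simple invariant: a restore only needs the files its own snapshot references, so unreferenced files can vanish freely. A toy model of that logic (an illustration only, not how Elasticsearch is actually implemented; all names are made up):

```python
# Toy model: the repository maps file paths to contents, and each snapshot
# records the set of files it references. A restore succeeds only if every
# referenced file is still present.

repo = {
    "indices/logstash-2016.05.01/0": b"...",
    "indices/logstash-2016.05.02/0": b"...",
    "snapshot-evening": b"...",
    "snapshot-evening2": b"...",
}

snapshots = {
    "evening":  {"snapshot-evening",
                 "indices/logstash-2016.05.01/0",
                 "indices/logstash-2016.05.02/0"},
    "evening2": {"snapshot-evening2",
                 "indices/logstash-2016.05.02/0"},
}

def can_restore(name: str) -> bool:
    """A snapshot is restorable iff all files it references still exist."""
    return snapshots[name] <= repo.keys()

# Simulate the TTL (or the 'mv' in step 4) deleting the 2016.05.01 files.
del repo["indices/logstash-2016.05.01/0"]

print(can_restore("evening"))   # the older snapshot now fails
print(can_restore("evening2"))  # the newer one still restores
```

This mirrors the outcome of steps 6 and 7: the snapshot that references the removed index files fails, while the later snapshot restores fine.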