Snapshots to S3; file TTL

Hi

I'm creating snapshots to S3.

In my bucket I set a TTL of 365 days, so old files are removed after 1 year.

via
https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html

The index snapshot process is incremental. In the process of making the index snapshot Elasticsearch analyses the list of the index files that are already stored in the repository and copies only files that were created or changed since the last snapshot.

I'm afraid I wouldn't be able to restore the most recent (younger than 1 year) snapshot once the older files have been deleted.
Have you tested whether the mechanism still works if some 1-year-old files vanish?

Yes. The process will only delete files if they are not related to older snapshots.

I'm not sure I was clear.
It is not the Elasticsearch S3 plugin that is going to remove the files.
The files have a Time-To-Live of 365 days and will be removed by S3 itself (automatically, with no process involved), regardless of whether they are related to another snapshot or not. That is why I need to check whether a snapshot can restore 'its part' from its own data.

It is important to be able to restore 'today's snapshot' without 'yesterday's files'.

Right, then that won't work.
S3 will potentially destroy your snapshots.

Well, you should not give up so easily :slight_smile:

I've just looked around; changing the S3 TTL (lifecycle) rule to auto-remove only files matching a pattern (like indices/logstash-201*) should probably solve the problem.
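For reference, such a lifecycle rule scoped to a prefix might look roughly like this (the rule ID is made up; note that S3 lifecycle filters take a literal key prefix such as indices/logstash-201 rather than a glob like indices/logstash-201*):

```json
{
  "Rules": [
    {
      "ID": "expire-old-logstash-index-data",
      "Filter": { "Prefix": "indices/logstash-201" },
      "Status": "Enabled",
      "Expiration": { "Days": 365 }
    }
  ]
}
```

This would only expire objects under indices/logstash-201..., leaving the top-level index, metadata* and snapshot* files untouched.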

I guess we do care about the index, metadata* and snapshot* files, but any indices/* data that is not part of the snapshot being restored could be missing, and the restore would still succeed.

It won't work.
You need to delete the snapshots via the API.
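For completeness, deleting a snapshot through the API is a single call (repository and snapshot names follow the convention used in this thread); Elasticsearch then removes only the files that no other snapshot still references:

```shell
# Delete one snapshot; files shared with other snapshots are kept
curl -XDELETE "localhost:9200/_snapshot/my_fs_repository/2016.05.08"
```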

Hi

I'm happy because my guess was correct. Please take a look:

I do create a snapshot:
curl -XPUT "localhost:9200/_snapshot/my_fs_repository/2016.05.09_evening?wait_for_completion=true"

Then remove an index (2016.05.01):
curl -XDELETE 'http://localhost:9200/logstash-2016.05.01*?pretty'

Take another snapshot:
curl -XPUT "localhost:9200/_snapshot/my_fs_repository/2016.05.09_evening2?wait_for_completion=true"

Pretend to remove the index from the filesystem snapshot repository (not via the API):
mv /mnt/nfs/indices/logstash-2016.05.01/ /mnt/nfs/indices/logstash-2016.05.01_pseudoremove

The index content:
ls /mnt/nfs/indices/logstash-2016.05.01_pseudoremove
0  1  2  3  4  snapshot-2016.05.08  snapshot-2016.05.08_clean  snapshot-2016.05.08_clean2  snapshot-2016.05.08_clean3  snapshot-2016.05.09_evening

curl -XPOST "localhost:9200/*/_close" # do I always have to close indices for a full snapshot restore?
{"acknowledged":true}

The first snapshot restore fails:
curl -XPOST "localhost:9200/_snapshot/my_fs_repository/2016.05.09_evening/_restore?pretty"
{
"error" : "SnapshotException[[my_fs_repository:2016.05.09_evening] failed to read metadata]; nested: FileNotFoundException[/mnt/nfs/indices/logstash-2016.05.01/snapshot-2016.05.09_evening (No such file or directory)]; ",
"status" : 500
}

But this one works:
curl -XPOST "localhost:9200/_snapshot/my_fs_repository/2016.05.09_evening2/_restore?pretty"
{
"accepted" : true
}

:slight_smile:
So with the right file-delete policy it works without an API delete call.
Some snapshot metadata may probably be left orphaned, but that is something I can accept.