After letting my snapshot file system run out of space (100% full), I hit an error where no more daily snapshots could run. This file system contains nothing but snapshots, so there were no other files I could delete.
Trying to delete some older snapshots also failed, with a "no space left on device" error.
Adding another disk or otherwise increasing the available space was not an option here, so I picked some snapshot files to delete. This is what I did:
On the configured repository mount point, I deleted the two files whose file system date matched an old snapshot's date, and in the indices subdirectory I deleted all the folders with the same date.
After that, I deleted one more old snapshot using Kibana's Snapshot and Restore UI. No errors occurred, and neither of the deleted snapshots is listed in Kibana's snapshot list any more.
The snapshot file system is now 80% used.
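For reference, the supported way to do that last deletion outside Kibana is the snapshot delete API, which only removes repository files that no other snapshot still references. A rough sketch in Python (the endpoint, repository and snapshot names below are just placeholders, not my real ones):

```python
import requests

ES = "http://localhost:9200"          # cluster endpoint (placeholder)
REPO = "my_fs_repo"                   # repository name (placeholder)
SNAPSHOT = "daily-snap-2023.01.01"    # snapshot to remove (placeholder)

# Ask Elasticsearch to delete the snapshot; it removes only the files
# in the repository that are no longer referenced by other snapshots.
resp = requests.delete(f"{ES}/_snapshot/{REPO}/{SNAPSHOT}", timeout=300)
resp.raise_for_status()
print(resp.json())   # expected: {"acknowledged": true}
```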
Adjust your snapshot policy (e.g. retention days) if needed to avoid future problems.
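If the snapshots come from an SLM policy, the retention block is where this is set. A rough sketch of what such a policy could look like (all names, the schedule and the retention values here are placeholders, not taken from my setup):

```python
import requests

ES = "http://localhost:9200"   # placeholder endpoint

policy = {
    "schedule": "0 30 1 * * ?",          # run daily at 01:30
    "name": "<daily-snap-{now/d}>",      # snapshot name pattern
    "repository": "my_fs_repo",          # placeholder repository name
    "config": {"include_global_state": True},
    "retention": {
        "expire_after": "30d",           # delete snapshots older than 30 days
        "min_count": 5,                  # but always keep at least 5
        "max_count": 50                  # and never more than 50
    },
}

resp = requests.put(f"{ES}/_slm/policy/daily-snapshots", json=policy)
resp.raise_for_status()

# Retention normally runs on its own schedule, but it can be triggered
# manually to free space sooner:
requests.post(f"{ES}/_slm/_execute_retention").raise_for_status()
```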
I'm not sure whether this procedure can be replicated or whether it was just luck that the file system dates matched the snapshot dates. Has anyone tried this approach?
If this worked, it was just luck: manually deleting files from a snapshot repository can render it completely unreadable. It might bite you in the future too; even if it seems to be working now, there's no guarantee it will carry on working.
I know that's not how this should be done.
Maybe file system space monitoring is the way to go, but I think Elastic could prevent this from happening by not using all the available space on the file system, so that we could at least start deleting older backups.
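On the monitoring side, even a small script watching the repository mount point would have given me enough warning to delete old snapshots in time. Something along these lines (the mount path and threshold are just examples):

```python
import shutil

MOUNT = "/mnt/es_snapshots"   # repository mount point (assumed path)
THRESHOLD = 0.80              # warn at 80% used

usage = shutil.disk_usage(MOUNT)
used_fraction = (usage.total - usage.free) / usage.total

if used_fraction >= THRESHOLD:
    # hook this into whatever alerting you already have (email, Slack, ...)
    print(f"WARNING: snapshot filesystem {used_fraction:.0%} full")
```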
Is there any way to know how much space each snapshot takes so we could estimate a new file system size?
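One thing I've considered, assuming the snapshot status API is available on this version, is asking Elasticsearch itself for per-snapshot stats. A sketch (repository name is a placeholder); note that snapshots are incremental and share segment files, so adding up the per-snapshot sizes will overestimate the space the repository really needs:

```python
import requests

ES = "http://localhost:9200"   # placeholder endpoint
REPO = "my_fs_repo"            # placeholder repository name

# List the snapshots currently in the repository.
snapshots = requests.get(f"{ES}/_snapshot/{REPO}/_all").json()["snapshots"]

for snap in snapshots:
    name = snap["snapshot"]
    # The status API includes per-snapshot stats (the exact field names
    # vary a bit between versions, so just inspect what comes back).
    status = requests.get(f"{ES}/_snapshot/{REPO}/{name}/_status").json()
    stats = status["snapshots"][0]["stats"]
    print(name, stats)
```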