Corrupted S3 Repository: 1.3M Metadata lines & snapshot_missing_exception (Cluster size: 6.5 TB)

Disha_Vala · February 9, 2026, 6:34am

Hi Elastic Team,

I am facing a repository corruption issue on Elasticsearch 8.5.3 and would appreciate guidance on how to perform a "surgical" repair of our snapshot metadata.

The Situation:

Cluster Scale: Our live cluster holds approximately 6.5 TB of data.
Metadata Scale: The repository metadata is extremely large: listing snapshots returns ~1.3 million lines of JSON.
The Cause: An S3 Lifecycle Policy was active on the bucket and deleted older objects that the repository still references.

Why a fresh repository is not a viable option for us:

Storage Impact: If we create a new repo, the first full snapshot would require uploading 6.5 TB to S3, which is a massive operation in terms of time and cost.

Data Retention Gap: Our cluster policy deletes live indices older than 90 days once they are backed up. If we start a fresh repo, we lose all data older than 90 days because that historical data only exists in this current, partially corrupted repository.

Restore Risk: We understand that while backups are incremental, restores are not. If we try to restore snapshots older than 90 days to "re-seed" a new repo, we are afraid the restores will fail because the repository in S3 may already be missing blobs or metadata that those snapshots reference. Restoring would also require a large amount of disk space that we want to avoid using.

How we noticed the issue:

Our daily snapshot policy began failing with this INTERNAL_SERVER_ERROR:
NoSuchFileException[Blob object [logs02-02/indices/PkoCOa-6T5S-jdgA5PREXA/0/index-rzgmdcDFQY2S5hZLn1aPBQ] not found: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey)]

Specific missing snapshot error:
{
  "type": "snapshot_missing_exception",
  "reason": "[bo-elk-backup:logs02-02-2023.07.01-gpq2m0k1sf-mqo_e9ghfbw/efJOWaAuRtSsx--S4kksaQ] is missing",
  "caused_by": {
    "type": "no_such_file_exception",
    "reason": "Blob object [logs02-02/snap-efJOWaAuRtSsx--S4kksaQ.dat] not found"
  }
}
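
For triage, the error above can be broken into its parts (repository, snapshot name, snapshot UUID, and the S3 key that is gone). The following is only a minimal sketch: the regular expressions are inferred from the two messages quoted in this post, the function name parse_missing_snapshot is made up for illustration, and other Elasticsearch versions may word these messages differently.

```python
import re

# Patterns inferred from the error messages quoted in this thread (an assumption,
# not a documented format).
SNAPSHOT_RE = re.compile(
    r"^\[(?P<repo>[^:\]]+):(?P<name>[^/\]]+)/(?P<uuid>[^\]]+)\] is missing$"
)
BLOB_RE = re.compile(r"Blob object \[(?P<key>[^\]]+)\] not found")


def parse_missing_snapshot(error: dict) -> dict:
    """Extract repository, snapshot name, UUID and missing blob key from an error object."""
    snap = SNAPSHOT_RE.match(error.get("reason", ""))
    if error.get("type") != "snapshot_missing_exception" or snap is None:
        raise ValueError("not a snapshot_missing_exception in the expected format")
    blob = BLOB_RE.search(error.get("caused_by", {}).get("reason", ""))
    return {
        "repository": snap["repo"],
        "snapshot": snap["name"],
        "uuid": snap["uuid"],
        "missing_blob": blob["key"] if blob else None,
    }


# The error object exactly as quoted above.
EXAMPLE = {
    "type": "snapshot_missing_exception",
    "reason": "[bo-elk-backup:logs02-02-2023.07.01-gpq2m0k1sf-mqo_e9ghfbw/efJOWaAuRtSsx--S4kksaQ] is missing",
    "caused_by": {
        "type": "no_such_file_exception",
        "reason": "Blob object [logs02-02/snap-efJOWaAuRtSsx--S4kksaQ.dat] not found",
    },
}

if __name__ == "__main__":
    print(parse_missing_snapshot(EXAMPLE))
```

The missing blob here is the snapshot-level metadata file (snap-<uuid>.dat), whereas the NoSuchFileException from the daily policy points at a shard-level index-<uuid> blob, so at least two kinds of object appear to have been removed.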

What we have verified:

Verification: POST /_snapshot/bo-elk-backup/_verify passes successfully.
S3 Config: Only one cluster has write access to this repository.
Lifecycle: We have now disabled the S3 Lifecycle policy to prevent further loss.

Our Questions:

Is there a way to prune references to these missing blobs to restore repository health without a full re-upload?
Does the _cleanup API in 8.5.3 effectively rewrite the index-N files if we manually delete the affected snapshots? Manually deleting snapshots also seems tricky, since we do not know how many of them are missing data.
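
To get a rough count of affected snapshots, one option is a read-only probe that requests each snapshot individually with GET /_snapshot/<repository>/<snapshot> and records the ones that come back with snapshot_missing_exception. This is a sketch under assumptions: the helper names (http_fetch, find_missing_snapshots) are hypothetical, authentication and TLS settings are omitted, the list of snapshot names has to come from somewhere else (for example an earlier listing), and error bodies are assumed to be JSON. It only detects a missing snap-<uuid>.dat; a snapshot that passes this check can still be missing shard-level blobs, so it does not prove a snapshot is restorable, and it repairs nothing.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterable, List, Tuple

# A fetch function takes a request path and returns (HTTP status, parsed JSON body).
Fetch = Callable[[str], Tuple[int, Dict]]


def http_fetch(base_url: str) -> Fetch:
    """Build a GET helper for the cluster at base_url (credentials/TLS options omitted)."""

    def fetch(path: str) -> Tuple[int, Dict]:
        try:
            with urllib.request.urlopen(base_url + path, timeout=60) as resp:
                return resp.status, json.load(resp)
        except urllib.error.HTTPError as err:
            # Elasticsearch normally returns a JSON error body with the failure details.
            return err.code, json.load(err)

    return fetch


def find_missing_snapshots(fetch: Fetch, repo: str, names: Iterable[str]) -> List[str]:
    """Return the snapshot names whose lookup fails with snapshot_missing_exception."""
    missing = []
    for name in names:
        path = f"/_snapshot/{repo}/{urllib.parse.quote(name, safe='')}"
        status, body = fetch(path)
        error = body.get("error", {}) if status >= 400 else {}
        if isinstance(error, dict) and error.get("type") == "snapshot_missing_exception":
            missing.append(name)
    return missing
```

Usage would look something like find_missing_snapshots(http_fetch("http://localhost:9200"), "bo-elk-backup", names), where the URL is a placeholder. Passing the fetch function in keeps the probing logic testable without a live cluster.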

I am happy to provide full server-side logs if needed to help diagnose the exact point of failure during metadata parsing.

Thank you so much in advance for all the suggestions.

DavidTurner · February 9, 2026, 12:31pm

require uploading 6.5 TB to S3, which is a massive operation in terms of time and cost.

I do not think this will cost very much money: for uploads, S3 charges per request, not per byte, and the per-request costs are not huge.
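
As a back-of-the-envelope illustration of that point, the request charge scales with the number of upload requests rather than with the bytes sent. The sketch below is only that: the function names are made up, and the per-request size (100 MB) and the price (USD 0.005 per 1,000 PUT requests) are assumptions to be replaced with the repository's real settings and current pricing for the region. It covers request charges only, and a real repository contains many small blobs, so the true request count will be higher than this estimate.

```python
def estimate_put_requests(total_bytes: int, bytes_per_request: int) -> int:
    """Number of upload requests if every request carried bytes_per_request (ceiling division)."""
    return -(-total_bytes // bytes_per_request)


def estimate_put_cost(total_bytes: int, bytes_per_request: int,
                      price_per_1000_requests: float) -> float:
    """Request charges only; storage of the uploaded data is billed separately."""
    requests = estimate_put_requests(total_bytes, bytes_per_request)
    return requests / 1000 * price_per_1000_requests


if __name__ == "__main__":
    total = 6_500_000_000_000      # 6.5 TB, from the original post
    per_request = 100_000_000      # ASSUMPTION: 100 MB uploaded per request
    price = 0.005                  # ASSUMPTION: USD per 1,000 PUT requests
    print(estimate_put_requests(total, per_request))
    print(estimate_put_cost(total, per_request, price))
```

Even if the real request count were a hundred times higher than this estimate, the request charges would remain small compared with what the data volume suggests.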

backups are incremental

No, backups are deduplicated, not incremental. If metadata is missing then restores will fail.
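
A toy model may make the distinction clearer. Each snapshot references the complete set of blobs it needs, and blobs that already exist in the repository are shared rather than uploaded again, so there is no chain of "base plus deltas". The sketch below is not the real repository format; the snapshot and blob names are invented for illustration.

```python
from typing import Dict, List, Set


def broken_snapshots(snapshots: Dict[str, Set[str]], deleted_blobs: Set[str]) -> List[str]:
    """Return the snapshots that reference at least one deleted blob."""
    return sorted(name for name, blobs in snapshots.items() if blobs & deleted_blobs)


# Hypothetical repository: each snapshot lists every blob it needs.
# seg_a was uploaded once by day1 and is reused (deduplicated) by day2.
SNAPSHOTS = {
    "day1": {"seg_a", "seg_b"},
    "day2": {"seg_a", "seg_b", "seg_c"},
    "day3": {"seg_c", "seg_d"},
}

if __name__ == "__main__":
    # An age-based lifecycle rule removes the oldest object, seg_a.
    print(broken_snapshots(SNAPSHOTS, {"seg_a"}))
```

In this model, removing the old blob seg_a breaks day2 as well as day1, even though day2 is a newer snapshot. That is why deleting objects by age can damage recent snapshots too.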

Is there a way to prune references to these missing blobs to restore repository health without a full re-upload?

No, sorry, starting with a new repository is the only safe option.
