Understanding Index Shard Snapshot Failed Exception

I create snapshots of my indices daily, and today I got this error:

"failures": [
  {
    "index": "name-2018.09.05",
    "index_uuid": "name-2018.09.05",
    "shard_id": 0,
    "reason": """IndexShardSnapshotFailedException[Failed to get store file metadata]; nested: CorruptIndexException[verification failed : calculated=9w6ma3 stored=1wuvx8p (resource=VerifyingIndexInput(MMapIndexInput(path="PATH/data/nodes/0/indices/xwjuzxurSveAJ6Wpb06MKA/0/index/_nbd.fdt")))]; """,
    "node_id": "ID",
    "status": "INTERNAL_SERVER_ERROR"
  }

I'm having trouble understanding this. Any help would be appreciated.

I am creating snapshots through Elasticsearch Curator, but I also get this error if I do a PUT /_snapshot/

It seems that one of your indices has a corrupted shard:

CorruptIndexException

The error message even gives the path to the corrupt index file:

path="PATH/data/nodes/0/indices/xwjuzxurSveAJ6Wpb06MKA/0/index/_nbd.fdt"
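If you want to pull those details out of the failure programmatically (for example, to log them from a nightly job), something like the following might work. It is only a sketch: the function name parse_corruption and the returned keys are my own invention, and the regular expressions assume the reason string keeps the same shape as the one you posted.

```python
import re

# Reason string copied from the failure above (PATH is the poster's placeholder).
REASON = (
    'IndexShardSnapshotFailedException[Failed to get store file metadata]; '
    'nested: CorruptIndexException[verification failed : calculated=9w6ma3 '
    'stored=1wuvx8p (resource=VerifyingIndexInput(MMapIndexInput('
    'path="PATH/data/nodes/0/indices/xwjuzxurSveAJ6Wpb06MKA/0/index/_nbd.fdt")))]; '
)


def parse_corruption(reason):
    """Extract checksum and file location from a CorruptIndexException reason.

    Hypothetical helper: returns None if the string does not look like
    the message shown in this thread.
    """
    checksums = re.search(r"calculated=(\w+) stored=(\w+)", reason)
    # Assumed layout: .../indices/<index directory>/<shard number>/index/<file>
    location = re.search(
        r'path="([^"]*/indices/([^/"]+)/(\d+)/index/([^/"]+))"', reason
    )
    if not checksums or not location:
        return None
    return {
        "calculated": checksums.group(1),  # checksum computed while reading
        "stored": checksums.group(2),      # checksum recorded in the file
        "path": location.group(1),
        "index_dir": location.group(2),    # directory name under indices/
        "shard": int(location.group(3)),
        "file": location.group(4),
    }


if __name__ == "__main__":
    print(parse_corruption(REASON))
```

The two checksums differ, which is what the "verification failed" part of the message is reporting.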

I'm not sure why it's corrupted, but the shard can't be snapshotted in that state. You may be able to snapshot the non-corrupted shards by setting partial to true in your Curator configuration (or in the regular snapshot API call).
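As a rough illustration of the API variant, here is how the request could be assembled. This is a sketch, not a tested recipe: the repository name "my_backup" and snapshot name "snapshot-1" are made up, and the helper only builds the request without sending it, so you would still pass the result to whatever HTTP client you use.

```python
import json


def partial_snapshot_request(repository, snapshot, indices):
    """Build (method, path, body) for a create-snapshot call with partial=true.

    Hypothetical helper; repository and snapshot names come from the caller.
    """
    path = "/_snapshot/{}/{}".format(repository, snapshot)
    body = {
        "indices": ",".join(indices),  # comma-separated index names
        "partial": True,               # allow shards that fail to be skipped
    }
    return "PUT", path, json.dumps(body)


if __name__ == "__main__":
    # "my_backup" and "snapshot-1" are placeholder names.
    method, path, body = partial_snapshot_request(
        "my_backup", "snapshot-1", ["name-2018.09.05"]
    )
    print(method, path)
    print(body)
```

Keep in mind that a partial snapshot will not contain the data from the shard that failed, so it is a workaround for getting the rest backed up, not a fix for the corruption.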

Thanks for responding.

Where can I read more about how to fix this shard, if that's possible? A simple Google search wasn't too helpful.

We are creating replicas, so can I just delete it?

That's a much deeper question. Is it the replica or the primary that is corrupt? If you don't know, then that makes it difficult to know which to address.
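One way to start narrowing that down might be to look at which copies of shard 0 sit on the node named in the failure. The sketch below parses the text you would get from GET _cat/shards/name-2018.09.05?h=index,shard,prirep,state,node. The function name and the sample output are invented for illustration, and it assumes you have already mapped the node_id from the failure to a node name.

```python
def roles_on_node(cat_shards_text, index, shard_id, node_name):
    """Return the roles ('primary' / 'replica') of a shard's copies on a node.

    Expects lines with the columns: index shard prirep state node.
    Hypothetical helper for illustration only.
    """
    labels = {"p": "primary", "r": "replica"}
    roles = []
    for line in cat_shards_text.strip().splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # e.g. unassigned copies have no node column
        idx, shard, prirep, _state, node = parts[:5]
        if idx == index and int(shard) == shard_id and node == node_name:
            roles.append(labels.get(prirep, prirep))
    return roles


if __name__ == "__main__":
    # Made-up sample output; node-1 and node-2 are placeholder node names.
    sample = (
        "name-2018.09.05 0 p STARTED node-1\n"
        "name-2018.09.05 0 r STARTED node-2\n"
    )
    print(roles_on_node(sample, "name-2018.09.05", 0, "node-1"))
```

This only tells you which copy lives where; it does not by itself prove which copy is damaged.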

I'd ask a new question here in the forums, specifically targeting the shard corruption issue (rather than the snapshot failing) and see if one of the Elasticsearch core developers has a good answer for you.
