Missing Snapshot Error

I'm trying to use Curator to delete all snapshots older than 100 days in my repository. I noticed it was failing with the error message:

Failed to complete action: delete_snapshots. <class 'curator.exceptions.FailedExecution'>: Unable to get snapshot information from repository: dcselasticsnapshot. Error: TransportError(500, 'snapshot_exception', '[dcselasticsnapshot:citydirectory_ocrpages/XGftDvzIRPGe_wMfN6FsSw] is missing')

Looking into this further, I went and tried to list all snapshots that are currently in the repo:

GET /_cat/snapshots/dcselasticsnapshot?v&s=id

This also failed, which tells me the problem is not with Curator but with the snapshot API and the repository metadata behind it. Intrigued, I tried to delete the snapshot and, not surprisingly, got the following:

{
  "error": {
    "root_cause": [
      {
        "type": "snapshot_missing_exception",
        "reason": "[dcselasticsnapshot:citydirectory_ocrpages/XGftDvzIRPGe_wMfN6FsSw] is missing"
      }
    ],
    "type": "snapshot_missing_exception",
    "reason": "[dcselasticsnapshot:citydirectory_ocrpages/XGftDvzIRPGe_wMfN6FsSw] is missing",
    "caused_by": {
      "type": "no_such_file_exception",
      "reason": "Blob object [snap-XGftDvzIRPGe_wMfN6FsSw.dat] not found: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: 26ED0010FAD1957E; S3 Extended Request ID: mC8EZfdWuubobpZndQGLi38t84Ni9EmDMc6kJzburYBGLhZijEtu5BUPluC4aq5hDHd5HcEN2l0=)"
    }
  },
  "status": 404
}

I tried to create a new snapshot with the missing name in order to delete it, hoping it would somehow overwrite the existing snapshot. I got the following error, saying the snapshot already exists!

{
  "error": {
    "root_cause": [
      {
        "type": "invalid_snapshot_name_exception",
        "reason": "[dcselasticsnapshot:citydirectory_ocrpages] Invalid snapshot name [citydirectory_ocrpages], snapshot with the same name already exists"
      }
    ],
    "type": "invalid_snapshot_name_exception",
    "reason": "[dcselasticsnapshot:citydirectory_ocrpages] Invalid snapshot name [citydirectory_ocrpages], snapshot with the same name already exists"
  },
  "status": 400
}

Is there a way to delete this snapshot so that the Curator script runs properly, or at least so that listing the snapshots in the repo works? I believe this is a case of a missing metadata pointer. I looked in the repo for the actual data associated with the snapshot but couldn't find it. If I were able to find it, would deleting the individual data/metadata files for the snapshot clear up the problem? FYI, I'm trying to avoid deleting the entire repo and starting over, as there are many current, useful backups in it. Thanks for any and all help.

The cluster has 4 nodes (3 data, 1 master). The S3 backup is on AWS; the bucket is named 'dcselasticsnapshot'.
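
For anyone checking the bucket by hand, the error text already names the object that S3 cannot find. The sketch below pulls the repository, snapshot name, UUID and blob name out of the "reason" strings quoted above. The function names and regular expressions are illustrative assumptions based only on the messages in this thread, not part of any Elasticsearch or Curator API.

```python
import re

# Matches the "[repo:snapshot/uuid]" prefix seen in the error messages in this thread.
SNAPSHOT_ID = re.compile(r"\[(?P<repo>[^:\]]+):(?P<name>[^/\]]+)/(?P<uuid>[^\]]+)\]")

# Matches the blob name in the S3 "not found" message.
BLOB_NAME = re.compile(r"Blob object \[(?P<blob>[^\]]+)\] not found")


def parse_snapshot_id(reason):
    """Return a dict with repo, name and uuid, or None if the text does not match."""
    match = SNAPSHOT_ID.search(reason)
    return match.groupdict() if match else None


def parse_missing_blob(reason):
    """Return the name of the blob reported as missing, or None if absent."""
    match = BLOB_NAME.search(reason)
    return match.group("blob") if match else None


if __name__ == "__main__":
    reason = "[dcselasticsnapshot:citydirectory_ocrpages/XGftDvzIRPGe_wMfN6FsSw] is missing"
    print(parse_snapshot_id(reason))
```

The blob name returned by parse_missing_blob is the key to look for in the bucket listing.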

A 500 error indicates something is wrong at the server level. Elasticsearch is perhaps having difficulty communicating with S3.

What happens if you run:

GET /_cat/snapshots/dcselasticsnapshot/

...without the extra query parameters?

For that matter, what do you get if you run:

GET /_cat/repositories

Hey @theuntergeek, thanks for responding. When I run:

GET /_cat/snapshots/dcselasticsnapshot/

I receive the following response:

{
  "error": {
    "root_cause": [
      {
        "type": "snapshot_missing_exception",
        "reason": "[dcselasticsnapshot:citydirectory_ocrpages/XGftDvzIRPGe_wMfN6FsSw] is missing"
      }
    ],
    "type": "snapshot_exception",
    "reason": "[dcselasticsnapshot:citydirectory_ocrpages/XGftDvzIRPGe_wMfN6FsSw] Snapshot could not be read",
    "caused_by": {
      "type": "snapshot_missing_exception",
      "reason": "[dcselasticsnapshot:citydirectory_ocrpages/XGftDvzIRPGe_wMfN6FsSw] is missing",
      "caused_by": {
        "type": "no_such_file_exception",
        "reason": "Blob object [snap-XGftDvzIRPGe_wMfN6FsSw.dat] not found: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: F53FFCB60B165355; S3 Extended Request ID: HLrbreo3Y/a0w1mHTqtLx83AZH58Mm9li1lEjRzK+FsMAWfEqOYLM7jfIrEVuyERjPf5c1dJSZU=)"
      }
    }
  },
  "status": 500
}
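
The 500 at the top level wraps a 404 from S3, which is easy to miss in the nested JSON. As a rough illustration, the sketch below walks the "caused_by" chain of an error body like the one above and lists each layer. The helper name cause_chain is hypothetical, and the sample dict is an abbreviated copy of the response above with the S3 request IDs trimmed.

```python
def cause_chain(error):
    """Return (type, reason) pairs from the outermost error down to the innermost caused_by."""
    chain = []
    node = error
    while isinstance(node, dict):
        chain.append((node.get("type"), node.get("reason")))
        node = node.get("caused_by")
    return chain


# Abbreviated copy of the "error" object from the response above.
SAMPLE_ERROR = {
    "type": "snapshot_exception",
    "reason": "[dcselasticsnapshot:citydirectory_ocrpages/XGftDvzIRPGe_wMfN6FsSw] Snapshot could not be read",
    "caused_by": {
        "type": "snapshot_missing_exception",
        "reason": "[dcselasticsnapshot:citydirectory_ocrpages/XGftDvzIRPGe_wMfN6FsSw] is missing",
        "caused_by": {
            "type": "no_such_file_exception",
            "reason": "Blob object [snap-XGftDvzIRPGe_wMfN6FsSw.dat] not found",
        },
    },
}

if __name__ == "__main__":
    for depth, (err_type, reason) in enumerate(cause_chain(SAMPLE_ERROR)):
        print("  " * depth + f"{err_type}: {reason}")
```

The last entry in the chain is the one that matters here: the snapshot's blob is absent from the bucket, so every request that needs to read it fails.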

When I run:

GET /_cat/repositories?v

I receive the following response:

id                 type
dcselasticsnapshot s3

Let me know if there's anything else you need. I look forward to hearing back from you, thanks! FYI, I'm still writing new snapshots to this S3 repo and restoring them successfully in other clusters.

You may be stuck there. It sounds to me like something the repository metadata expects to find in the S3 bucket was changed or never propagated properly. I personally know of no way to correct that. If it were me, I'd start with a fresh repository in a different bucket.

Oh no, I had a feeling you were going to say that. Do I need to create the repo in a different bucket to fix the problem, or can I just delete what's in the existing bucket, either manually or via commands to the cluster?

If you delete the contents of the bucket, and also delete the repository from Elasticsearch, you should be able to re-use it by re-creating the repository.
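
As a minimal sketch of the Elasticsearch side of that, the two helpers below build the REST calls for removing the repository registration and registering it again. The helper names are hypothetical, and the settings body assumes only the bucket name mentioned in this thread; any other settings the original repository used (region, base path, and so on) are unknown here and would need to be added. Emptying the bucket itself happens outside Elasticsearch and is not shown.

```python
import json


def unregister_repository_request(repo):
    """Build the call that removes the repository registration from the cluster.

    DELETE /_snapshot/<repo> only unregisters the repository; it does not
    delete any objects from the bucket.
    """
    return ("DELETE", f"/_snapshot/{repo}", None)


def register_s3_repository_request(repo, bucket):
    """Build the call that registers an S3 repository pointing at the given bucket."""
    body = {"type": "s3", "settings": {"bucket": bucket}}
    return ("PUT", f"/_snapshot/{repo}", json.dumps(body))


if __name__ == "__main__":
    for method, path, body in (
        unregister_repository_request("dcselasticsnapshot"),
        register_s3_repository_request("dcselasticsnapshot", "dcselasticsnapshot"),
    ):
        print(method, path)
        if body:
            print(body)
```

The printed method, path and body can be pasted into the same console used for the GET requests earlier in the thread.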
