Unable to delete snapshot

In our environment, we have two different Elasticsearch servers running on an EKS cluster, and we use the elasticsearch-curator tool to take periodic snapshots to an AWS S3 bucket.

On both ES servers, elasticsearch-curator was configured to use the same snapshot name and repository name, with the same underlying S3 bucket path, and to take snapshots at the same time. Because of this shared curator config, snapshot creation was failing on one of the ES clusters.

After changing the snapshot name and S3 base path for one of the curator jobs, the snapshot creation issue was resolved on both ES servers. However, we are now having issues deleting snapshots.

Deleting any snapshot fails with the error below.

DELETE _snapshot/s3_repository_hourly/snapshot-2023-06-16-01-00

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_state_exception",
        "reason" : "Duplicate key snapshot-2023-06-16-04-00 (attempted merging values snapshot-2023-06-16-04-00/mkqC3EnbQxSgCeOSAIpWRA and snapshot-2023-06-16-04-00/-w2zPgxfSA2vLBkHYyNWXw)"
      }
    ],
    "type" : "illegal_state_exception",
    "reason" : "Duplicate key snapshot-2023-06-16-04-00 (attempted merging values snapshot-2023-06-16-04-00/mkqC3EnbQxSgCeOSAIpWRA and snapshot-2023-06-16-04-00/-w2zPgxfSA2vLBkHYyNWXw)"
  },
  "status" : 500
}

==========================

DELETE _snapshot/s3_repository_hourly/snapshot-2023-06-16-02-00

"type" : "illegal_state_exception",
"reason" : "Duplicate key snapshot-2023-06-16-04-00 (attempted merging values snapshot-2023-06-16-04-00/mkqC3EnbQxSgCeOSAIpWRA and snapshot-2023-06-16-04-00/-w2zPgxfSA2vLBkHYyNWXw)"

==========================

DELETE _snapshot/s3_repository_hourly/snapshot-2023-06-16-04-00

"type" : "illegal_state_exception",
"reason" : "Duplicate key snapshot-2023-06-16-04-00 (attempted merging values snapshot-2023-06-16-04-00/mkqC3EnbQxSgCeOSAIpWRA and snapshot-2023-06-16-04-00/-w2zPgxfSA2vLBkHYyNWXw)"

Every delete request, whichever snapshot it targets, fails with an error pointing to the same snapshot: Duplicate key snapshot-2023-06-16-04-00.
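To confirm that every failed delete names the same duplicated snapshot, the "reason" string can be parsed. The following is a minimal Python sketch, not an Elasticsearch API: the function name parse_duplicate_key is hypothetical, and it assumes each merged value has the form name/uuid, which is how the values appear in the error responses above.

```python
import re

# Matches the "Duplicate key ..." reason text seen in the error responses.
DUPLICATE_KEY = re.compile(
    r"Duplicate key (?P<name>\S+) "
    r"\(attempted merging values (?P<first>\S+) and (?P<second>\S+)\)"
)


def parse_duplicate_key(reason):
    """Return the duplicated snapshot name and the two conflicting IDs.

    Hypothetical helper: assumes each merged value looks like "name/uuid".
    Returns None when the reason does not match the expected message.
    """
    match = DUPLICATE_KEY.search(reason)
    if match is None:
        return None
    # Keep only the part after the first "/" of each merged value.
    uuids = [match.group(key).partition("/")[2] for key in ("first", "second")]
    return {"name": match.group("name"), "uuids": uuids}


reason = (
    "Duplicate key snapshot-2023-06-16-04-00 (attempted merging values "
    "snapshot-2023-06-16-04-00/mkqC3EnbQxSgCeOSAIpWRA and "
    "snapshot-2023-06-16-04-00/-w2zPgxfSA2vLBkHYyNWXw)"
)
print(parse_duplicate_key(reason))
```

Two different IDs under one snapshot name is consistent with two clusters having each written a snapshot called snapshot-2023-06-16-04-00 into the same repository.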

I'm looking for input on how to clean up snapshot snapshot-2023-06-16-04-00 from the repository, and any suggestions on how to fix this issue.

I also understand that one repository can't be used by multiple clusters, but in this kind of scenario we need a way to cleanly delete a snapshot.

I don't think there is a way. Unfortunately, if you are writing to the same repository from multiple clusters, there's a risk that they will eventually leave the repository in an unrecoverable state. You will need to create a fresh repository for each of your clusters and discard the broken one.
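As an illustration of that advice, here is a minimal Python sketch that builds one S3 repository definition per cluster and refuses any pair that points at the same bucket and base path. The cluster names, bucket name, and the snapshots/<cluster> path layout are hypothetical placeholders. Each resulting body is the kind of JSON one would send to PUT _snapshot/<repository> on the matching cluster, using the bucket and base_path settings of the S3 repository type.

```python
def repository_body(bucket, cluster_name):
    """Build an S3 repository definition with a base path unique to one cluster.

    The "snapshots/<cluster>" layout is an assumption for illustration.
    """
    return {
        "type": "s3",
        "settings": {
            "bucket": bucket,
            "base_path": f"snapshots/{cluster_name}",
        },
    }


def check_no_shared_location(bodies):
    """Raise ValueError if two repositories use the same bucket and base path."""
    seen = {}
    for repo_name, body in bodies.items():
        settings = body["settings"]
        location = (settings["bucket"], settings["base_path"].strip("/"))
        if location in seen:
            raise ValueError(
                f"{repo_name} and {seen[location]} share {location}"
            )
        seen[location] = repo_name


# Hypothetical names: one repository per cluster, all in the same bucket.
bodies = {
    "s3_repository_hourly_a": repository_body("my-snapshot-bucket", "es-cluster-a"),
    "s3_repository_hourly_b": repository_body("my-snapshot-bucket", "es-cluster-b"),
}
check_no_shared_location(bodies)
print(bodies["s3_repository_hourly_a"]["settings"]["base_path"])
```

The check only compares exact locations. It does not catch one base path nested inside another, so keeping each cluster under its own top-level prefix is the safer layout.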
