Possible to manually delete S3 backup repository?


(Patrick Hannigan) #1

Hi all, we've started backing up our Elasticsearch cluster into a new S3 bucket/repo and would like to delete the previous bucket/repo that we were using. However, when we try to delete the old repo, we get this error:

{"error":{"root_cause":[{"type":"remote_transport_exception","reason":"[MASTER-ip-10-0-0-71][10.0.0.71:9300][cluster:admin/repository/delete]"}],"type":"illegal_state_exception","reason":"trying to modify or unregister repository that is currently used "},"status":500}
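For anyone scripting around this, here is a minimal Python sketch of how a response like the one above could be recognised as the "repository still in use" case rather than some other failure. The function name `is_repository_in_use` is made up for illustration, and the check only relies on the `type` and `reason` fields visible in the error above; other Elasticsearch versions may word the reason differently.

```python
import json


def is_repository_in_use(error_body: str) -> bool:
    """Return True if an error response says the repository is still in use.

    Hypothetical helper: it only looks at the "type" and "reason" fields
    that appear in the error quoted in this thread.
    """
    try:
        error = json.loads(error_body).get("error", {})
    except (ValueError, AttributeError):
        # Not JSON, or not a JSON object at the top level.
        return False
    if not isinstance(error, dict):
        # Some responses carry the error as a plain string.
        return False
    return (
        error.get("type") == "illegal_state_exception"
        and "currently used" in error.get("reason", "")
    )
```

A `True` result suggests that a snapshot or restore still references the repository, so retrying the delete without clearing that task is unlikely to help.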

We've found that there's a stuck partial restore from months ago that we can't seem to cancel.

We don't care about that restore or this repo. Can we safely just delete the S3 bucket and manually delete the pointers to it on the Elasticsearch side?

Any ideas are greatly appreciated!

Patrick

Edit: at the suggestion of Elastic support, I've also opened an issue on GitHub: https://github.com/elastic/elasticsearch/issues/26917


(Patrick Hannigan) #2

Bumping this to see if anyone has insight into the problems we're encountering above.


(Tanguy) #3

Hi Patrick,

I saw your issue on GitHub: https://github.com/elastic/elasticsearch/issues/26917

On 2.1.1, I think that the only way to get rid of this stuck restore task is to restart the cluster. After the restart, you should be able to delete the repository on Elasticsearch and then on S3.
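As a rough illustration of the Elasticsearch half of that last step, the sketch below builds the `DELETE /_snapshot/<repository>` request that unregisters a repository. The base URL `http://localhost:9200` and the repository name `old_s3_repo` are placeholders, not values from this thread, and `unregister_repository_request` is a hypothetical helper. Unregistering only removes the reference held by the cluster; the bucket itself would still have to be removed on the AWS side afterwards.

```python
from urllib.parse import quote
from urllib.request import Request


def unregister_repository_request(base_url: str, repository: str) -> Request:
    """Build (but do not send) the request that unregisters a snapshot repository."""
    # Percent-encode the repository name so odd characters cannot alter the path.
    url = f"{base_url.rstrip('/')}/_snapshot/{quote(repository, safe='')}"
    return Request(url, method="DELETE")


# Placeholder values, for illustration only.
request = unregister_repository_request("http://localhost:9200", "old_s3_repo")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. While a restore is still registered against the repository, that call is expected to fail with the same HTTP 500 shown in the first post.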


(Patrick Hannigan) #4

@tanguy I've rolling-restarted all the nodes on our cluster and I'm still seeing the restore in state 'STARTED' when I visit http://elasticsearch.hive.co:9200/_cluster/state/customs

Is there anything else I need to do to clear out the restore? Anything I'm missing here?

Thanks for your help!


(Tanguy) #5

Can you please share the part of the JSON from /_cluster/state/customs that concerns the stuck restore?
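If it is useful, here is a small Python sketch for pulling just that section out of the full response. It assumes the customs appear as top-level keys of the response and that the restore entry looks like `{"restore": {"snapshots": [...]}}` with `snapshot`, `repository` and `state` fields. That layout is an assumption about 2.x output, so check it against what your cluster actually returns. The sample data below is invented.

```python
def restores_in_progress(cluster_state: dict) -> list:
    """Summarise restore entries found in a /_cluster/state/customs response.

    Assumes a top-level "restore" object holding a "snapshots" list; returns
    an empty list when that section is missing.
    """
    restore = cluster_state.get("restore") or {}
    return [
        {
            "snapshot": entry.get("snapshot"),
            "repository": entry.get("repository"),
            "state": entry.get("state"),
        }
        for entry in restore.get("snapshots", [])
    ]


# Invented sample shaped like the assumed layout.
sample = {
    "restore": {
        "snapshots": [
            {"snapshot": "snap_1", "repository": "old_s3_repo", "state": "STARTED"}
        ]
    }
}
for entry in restores_in_progress(sample):
    print(entry["repository"], entry["snapshot"], entry["state"])
```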


(Patrick Hannigan) #6

Here's the output: https://gist.github.com/patmanh/50695e4c5d46db7ede49706309602052


(Patrick Hannigan) #7

@tanguy any more ideas on your end? Thanks!


(Tanguy) #8

I think that the only way to get rid of this stuck snapshot is to do a full cluster restart, i.e. shut down all your nodes and then start them again.
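For reference, a full cluster restart is usually wrapped in a few API calls so that shards are not reshuffled while nodes are down. The Python sketch below only lists those calls in order, without sending anything. The sequence follows the commonly documented procedure (disable allocation, synced flush, restart, wait for health, re-enable allocation); treat it as an assumption to check against the documentation for your version. `full_restart_plan` is a hypothetical helper.

```python
import json


def full_restart_plan() -> list:
    """Return the ordered (method, target, body) steps around a full restart."""
    disable = {"persistent": {"cluster.routing.allocation.enable": "none"}}
    enable = {"persistent": {"cluster.routing.allocation.enable": "all"}}
    return [
        # Stop the cluster from reallocating shards while nodes are down.
        ("PUT", "/_cluster/settings", json.dumps(disable)),
        # Synced flush to speed up shard recovery after the restart.
        ("POST", "/_flush/synced", None),
        # Not an API call: this step is done on the hosts themselves.
        ("MANUAL", "stop every node, then start them all again", None),
        # Wait until the cluster has recovered its primaries.
        ("GET", "/_cluster/health?wait_for_status=yellow", None),
        # Allow shard allocation again.
        ("PUT", "/_cluster/settings", json.dumps(enable)),
    ]


for method, target, body in full_restart_plan():
    print(method, target, body or "")
```

The key difference from a rolling restart is the manual step: every node is down at the same time, so the cluster state is rebuilt from what is persisted on disk when the nodes come back.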


(Patrick Hannigan) #9

@tanguy got it. Can you foresee any obvious complications if we leave the stuck snapshot and just delete the bucket from S3?

