Snapshot stuck IN_PROGRESS


(Markuchi) #1

A snapshot has been running for days and is stuck in the IN_PROGRESS state.

During the snapshot we hit the low watermark. We want to stop the snapshot and run it again once we fix the watermark.

We have performed rolling restarts multiple times and have also tried to delete the snapshots.
The delete requests just hang and have kept running for more than 24 hours.

"state" : "IN_PROGRESS",
"start_time" : "2018-09-11T22:00:21.694Z",
"start_time_in_millis" : 1536703221694,
"end_time" : "1970-01-01T00:00:00.000Z",
"end_time_in_millis" : 0,
"duration_in_millis" : -1536703221694,
"failures" : [ ],
"shards" : {
"total" : 0,
"failed" : 0,
"successful" : 0

Running _status on the snapshot shows some shards in the FAILURE stage:

"shards" : {
"0" : {
"stage" : "FAILURE",
"stats" : {
"number_of_files" : 0,
"processed_files" : 0,
"total_size_in_bytes" : 0,
"processed_size_in_bytes" : 0,
"start_time_in_millis" : 0,
"time_in_millis" : 0
}
},
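For anyone reading a long _status response by hand, here is a minimal Python sketch that lists the shards reported in the FAILURE stage. It assumes the response has the usual `snapshots` → `indices` → `shards` nesting. The index name `logs-2018.09.11` and the sample data are made up for illustration and are not from this cluster.

```python
def failed_shards(status_response):
    """Return (index, shard_id, reason) for every shard whose stage is FAILURE.

    `status_response` is the parsed JSON body of a snapshot _status call.
    The nesting (snapshots -> indices -> shards) is an assumption about
    the response layout, so adjust it if your output differs.
    """
    failed = []
    for snapshot in status_response.get("snapshots", []):
        for index_name, index_info in snapshot.get("indices", {}).items():
            for shard_id, shard in index_info.get("shards", {}).items():
                if shard.get("stage") == "FAILURE":
                    # "reason" may be absent; None is returned in that case.
                    failed.append((index_name, shard_id, shard.get("reason")))
    return failed


# Hypothetical sample shaped like the fragment above (index name is invented).
sample = {
    "snapshots": [
        {
            "state": "IN_PROGRESS",
            "indices": {
                "logs-2018.09.11": {
                    "shards": {
                        "0": {"stage": "FAILURE", "stats": {"number_of_files": 0}},
                        "1": {"stage": "DONE", "stats": {"number_of_files": 12}},
                    }
                }
            },
        }
    ]
}

for index_name, shard_id, reason in failed_shards(sample):
    print(index_name, shard_id, reason)
```

Feeding it the real response (for example via `json.load` on a saved file) should give a short list of the shards worth looking at first.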

Specifications

  • Version: Curator 5.5.4, Elasticsearch 5.6.10
  • Platform: CentOS 7

Any assistance will be appreciated.


(Markuchi) #2

bump


(Aaron Mildenstein) #3

I changed the title. Curator is a red herring here, and it may turn away potential help.


(Markuchi) #4

Thanks.
I finally fixed the issue.

I tried closing indexes and found that there were a few that wouldn't close.
I did a rolling restart, but they still wouldn't close.
I then restarted all servers in the cluster at the same time, and the indexes finally closed.
The snapshot was still running even after this.
I reopened all the closed indexes, and the snapshot was finally no longer running.

Running a snapshot now completes.
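In case it helps someone else, the sequence above could be written down roughly like this. It is only a sketch of the order of steps, not a tested tool: the function name `recovery_plan` and the index names are hypothetical, and the full cluster restart stays a manual step. The paths are the standard close index, open index, and snapshot status endpoints.

```python
def recovery_plan(indices):
    """Build the ordered list of steps described in this post.

    Each step is a (method, target) tuple. "MANUAL" marks a step that is
    not a REST call. Nothing is sent to the cluster; this only lays out
    the order so it can be reviewed before running anything.
    """
    steps = []
    # 1. Try to close every index (some may refuse to close at first).
    for name in indices:
        steps.append(("POST", f"/{name}/_close"))
    # 2. A rolling restart was not enough here; all nodes were restarted at once.
    steps.append(("MANUAL", "restart all nodes in the cluster at the same time"))
    # 3. Reopen the indexes that were closed.
    for name in indices:
        steps.append(("POST", f"/{name}/_open"))
    # 4. Check whether any snapshot is still reported as running.
    steps.append(("GET", "/_snapshot/_status"))
    return steps


# Hypothetical index names, for illustration only.
for method, target in recovery_plan(["logs-a", "logs-b"]):
    print(method, target)
```

Note that a full cluster restart means downtime, so this is a last resort after rolling restarts have failed.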

