Running Elasticsearch 7.10.
I have several indexes with unassigned replica shards, although the primaries are OK. For example:
{
"index" : "authm-000005",
"shard" : 1,
"primary" : false,
"current_state" : "unassigned",
"unassigned_info" : {
"reason" : "ALLOCATION_FAILED",
"at" : "2021-02-15T09:19:47.045Z",
"failed_allocation_attempts" : 5,
"details" : "failed shard on node [6UDagJW2T3eWM-0PQJ0rMA]: failed to create shard, failure IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[authm-000005][1]: obtaining shard lock for [starting shard] timed out after [5000ms], lock already held for [closing shard] with age [955200ms]]; ",
"last_allocation_status" : "no_attempt"
},
"can_allocate" : "no",
"allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
Every node gives the same reason for blocking allocation:
"deciders" : [
{
"decider" : "max_retry",
"decision" : "NO",
"explanation" : "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2021-02-15T09:19:47.045Z], failed_attempts[5], failed_nodes[[6UDagJW2T3eWM-0PQJ0rMA]], delayed=false, details[failed shard on node [6UDagJW2T3eWM-0PQJ0rMA]: failed to create shard, failure IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[authm-000005][1]: obtaining shard lock for [starting shard] timed out after [5000ms], lock already held for [closing shard] with age [955200ms]]; ], allocation_status[no_attempt]]]"
I understand that some manual intervention is needed to break the deadlock, but I can't figure out what I need to do. I have tried various reroute commands without getting anywhere.
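For reference, the retry that the max_retry decider message asks for (`/_cluster/reroute?retry_failed=true`) can be sketched as below, together with a follow-up allocation explain call for the same replica. This is only a minimal sketch, not a confirmed fix: the base URL `http://localhost:9200`, the absence of authentication, and the helper names are all assumptions for illustration. If the lock held by the closing shard has not been released, the retry may well fail in the same way.

```python
import json
import urllib.request


def build_retry_failed_request(base_url):
    """Build (without sending) POST /_cluster/reroute?retry_failed=true."""
    url = base_url.rstrip("/") + "/_cluster/reroute?retry_failed=true"
    return urllib.request.Request(url, data=b"", method="POST")


def build_explain_request(base_url, index, shard, primary=False):
    """Build (without sending) an allocation explain request for one shard copy."""
    body = json.dumps({"index": index, "shard": shard, "primary": primary})
    url = base_url.rstrip("/") + "/_cluster/allocation/explain"
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Usage would be something like `send(build_retry_failed_request("http://localhost:9200"))`, followed by `send(build_explain_request("http://localhost:9200", "authm-000005", 1))` to see whether `failed_allocation_attempts` was reset and whether the replica is still blocked.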