ES version - 6.x
The ES cluster is red, and the allocation explain API returns the error below:
```
sh-4.2# curl -k -XGET https://127.0.0.1:9200/_cluster/allocation/explain?pretty
{
  "index" : "index1",
  "shard" : 2,
  "primary" : true,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "ALLOCATION_FAILED",
    "at" : "2022-09-28T06:08:02.154Z",
    "failed_allocation_attempts" : 5,
    "details" : "failed to create shard, failure IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[index1][2]: obtaining shard lock timed out after 5000ms]; ",
    "last_allocation_status" : "no"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes that hold an in-sync shard copy",
  "node_allocation_decisions" : [
    {
      "node_id" : "K_nIcsssdssRvQIagrsf2QLkfIQ",
      "node_name" : "K_nIcsR",
      "transport_address" : "localhost:9300",
      "node_decision" : "no",
      "store" : {
        "in_sync" : false,
        "allocation_id" : "PW4oAHGAT9KLvL24_GEjSQ"
      }
    },
    {
      "node_id" : "uOPt4GKBsfsfSsyuVLVu-IRZ-g",
      "node_name" : "uOPtsff4GK",
      "transport_address" : "localhost:9300",
      "node_decision" : "no",
      "store" : {
        "in_sync" : true,
        "allocation_id" : "sNdzxssfsfsTK4SV6PPz16z6gA4Q"
      },
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "NO",
          "explanation" : "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2022-09-28T06:08:02.154Z], failed_attempts[5], delayed=false, details[failed to create shard, failure IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[ise][2]: obtaining shard lock timed out after 5000ms]; ], allocation_status[deciders_no]]]"
        }
      ]
    }
  ]
}
```
What could be causing this issue? We cannot reproduce it on all of our setups; only some of them are affected.

I saw a couple of forum threads that suggest calling reroute with retry_failed and increasing the maximum number of allocation retries. My question: once we call reroute with retry_failed=true and set max_retries to 15, will that change persist? And whenever there is a sync issue and the 15 retries are exhausted, will the reroute then happen automatically? I ask because the explanation above says we have to call it manually every time. Please clarify this for me. Below is what I am planning to suggest.
curl -XPOST 'localhost:9200/_cluster/reroute?retry_failed=true'
curl --silent --request PUT --header 'Content-Type: application/json' 127.0.0.1:9200/ise/_settings?pretty=true --data-ascii '{
"index": {
"allocation": {
"max_retries": 15
}
}
}'
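
In case it helps frame the question, here is a rough sketch of how I imagine scripting the retry so that it only fires when the max_retry decider is what blocks allocation. The helper name needs_retry, the ES_URL variable, and the RUN guard are placeholders I made up and are not part of Elasticsearch. As far as I understand, something external (for example cron) would still have to run this, which is exactly what I would like confirmed.

```shell
#!/bin/sh
# Hypothetical helper: reads allocation-explain JSON on stdin and prints
# "retry" if the max_retry decider appears in it, otherwise "skip".
needs_retry() {
  if grep -q '"decider"[[:space:]]*:[[:space:]]*"max_retry"'; then
    echo retry
  else
    echo skip
  fi
}

# Assumption: same endpoint as in the explain call above; override via ES_URL.
ES_URL="${ES_URL:-https://127.0.0.1:9200}"

# Only contact the cluster when explicitly asked to (RUN=1).
if [ "${RUN:-0}" = "1" ]; then
  decision=$(curl -sk "$ES_URL/_cluster/allocation/explain" | needs_retry)
  if [ "$decision" = "retry" ]; then
    # One-off manual retry, as suggested by the decider explanation.
    curl -sk -XPOST "$ES_URL/_cluster/reroute?retry_failed=true"
  fi
fi
```

Running it with RUN=1 would perform the check and, if needed, the retry; without RUN=1 it only defines the helper and touches nothing.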
Thanks