Hello, my cluster is currently in red state because both the primary and the replica of one shard of an index have become unassigned. This happened after a number of large tasks were accidentally executed simultaneously against the same index alias. The affected index was the current write index of that alias. It contains valuable data, and I would like to minimise data loss, ideally to none.
Calling GET _cluster/allocation/explain on the primary returns:
unassigned_info->reason: "ALLOCATION_FAILED"
allocate_explanation: "cannot allocate because all found copies of the shard are either stale or corrupt"
unassigned_info->details: "failed shard on node [<node_id>]: shard failure, reason [merge failed], failure NotSerializableExceptionWrapper[merge_exception: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?)..."
Calling the same on the replica returns:
unassigned_info->reason: "ALLOCATION_FAILED"
allocate_explanation: "cannot allocate because allocation is not permitted to any of the nodes"
unassigned_info->details: "failed shard on node [<node_id>]: failed to perform indices:data/write/bulk[s] on replica [<index_name>][<shard_num>], node[<node_id>], [R], s[STARTED], a[id=<>], failure IndexShardClosedException[CurrentState[CLOSED] Primary closed.]"
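For reference, the two allocation explain calls above can be sketched in Python as below. This is only a minimal sketch that builds the HTTP method, path and JSON body without sending anything. The function name `build_explain_request`, the index name `my-index` and shard number `0` are placeholders for illustration, not the real values from this cluster.

```python
import json

def build_explain_request(index, shard, primary):
    """Build (method, path, body) for the cluster allocation explain API."""
    body = {
        "index": index,      # index that owns the unassigned shard (placeholder)
        "shard": shard,      # shard number within that index (placeholder)
        "primary": primary,  # True explains the primary copy, False a replica
    }
    return "GET", "/_cluster/allocation/explain", json.dumps(body)

# One request for the primary copy and one for the replica copy.
for primary in (True, False):
    method, path, body = build_explain_request("my-index", 0, primary)
    print(method, path, body)
```

The printed method, path and body can then be sent with any HTTP client or pasted into a console.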
I attempted a dry run of manually reallocating the replica using the reroute API, and received a status 400 with: "[allocate_replica] trying to allocate a replica shard [<index_name>][<shard_num>], while corresponding primary shard is still unassigned"
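For reference, a dry-run reroute request of this kind can be sketched in Python as below. It is a minimal sketch, not the exact call that was made: it only builds the request for the reroute API with the `dry_run` query parameter and an `allocate_replica` command. The function name `build_replica_dry_run` and the values `my-index`, `0` and `node-1` are hypothetical placeholders.

```python
import json

def build_replica_dry_run(index, shard, node):
    """Build (method, path, body) for a dry-run allocate_replica reroute."""
    command = {
        "allocate_replica": {
            "index": index,  # index that owns the shard (placeholder)
            "shard": shard,  # shard number (placeholder)
            "node": node,    # target node name or id (placeholder)
        }
    }
    # dry_run=true asks the cluster to simulate the command without applying it.
    path = "/_cluster/reroute?dry_run=true"
    return "POST", path, json.dumps({"commands": [command]})

method, path, body = build_replica_dry_run("my-index", 0, "node-1")
print(method, path, body)
```

While the primary is still unassigned, a request shaped like this is what gets rejected with the 400 response quoted above.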
What is the best course of action here? I gather I need to assign the primary shard before I can do anything with the replica. Given the CorruptIndexException, I am concerned that the primary shard (and potentially the replica too) has lost data, so my thinking was that recovering from the replica would be my best bet. Is my understanding incorrect, or am I going to have to accept some data loss?
Many thanks