How do we recover from corruption in a transaction log file?

The transaction log in our environment has become corrupted, and as a result the indices are red. Because of this we cannot index new items, and the backlog of items to index keeps growing.

The error details are below:
{
  "note" : "No shard was specified in the explain API request, so this response explains a randomly chosen unassigned shard. There may be other unassigned shards in this cluster which cannot be assigned for different reasons. It may not be possible to assign this shard until one of the other shards is assigned correctly. To explain the allocation of other shards (whether assigned or unassigned) you must specify the target shard in the request to this API.",
  "index" : "Index_Name",
  "shard" : 2,
  "primary" : true,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "ALLOCATION_FAILED",
    "at" : "2022-06-07T12:44:18.956Z",
    "failed_allocation_attempts" : 5,
    "details" : "failed shard on node [20OJCb-iQayvsxuXhtUbBQ]: shard failure, reason [failed to recover from translog], failure EngineException[failed to recover from translog]; nested: TranslogCorruptedException[translog from source [\oi.bdl.lu\partage\Evault_Index\ElasticIndex\nodes\0\indices\KVX3QvZwQzORDosOJJm4tw\2\translog\translog-3796.tlog] is corrupted, operation size must be at least 4 but was: 0]; ",
    "last_allocation_status" : "no"
  },
  "can_allocate" : "yes",
  "allocate_explanation" : "can allocate the shard",
  "target_node" : {
    "id" : "20OJCb-iQayvsxuXhtUbBQ",
    "name" : "-----",
    "transport_address" : "127.0.0.1:9202",
    "attributes" : {
      "xpack.installed" : "true",
      "transform.node" : "false"
    }
  },
  "allocation_id" : "1454Ln2MSdy0MGFFpywN5w",
  "node_allocation_decisions" : [
    {
      "node_id" : "20OJCb-iQayvsxuXhtUbBQ",
      "node_name" : "----",
      "transport_address" : "127.0.0.1:9202",
      "node_attributes" : {
        "xpack.installed" : "true",
        "transform.node" : "false"
      },
      "node_decision" : "yes",
      "store" : {
        "in_sync" : true,
        "allocation_id" : "1454Ln2MSdy0MGFFpywN5w"
      }
    }
  ]
}
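
The note in the response says the explain API picked an unassigned shard at random. As a rough sketch, this is how a request aimed at one specific shard could be built, using the index name and shard number from the output above. The helper name `explain_request` is hypothetical, and the code only constructs the request; it does not contact the cluster.

```python
import json


def explain_request(index, shard, primary=True):
    """Build a cluster allocation explain request for one specific shard."""
    # The body names the exact shard to explain instead of letting
    # Elasticsearch choose a random unassigned one.
    body = {"index": index, "shard": shard, "primary": primary}
    return "GET", "/_cluster/allocation/explain", json.dumps(body)


method, path, body = explain_request("Index_Name", 2)
print(method, path)
print(body)
```

Repeating this for each shard number of the index would show whether other shards are unassigned for the same reason.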

My question here is: how do we recover from this kind of transaction log corruption and bring the indices back to a green state? And how can we avoid data loss while recovering from such situations?

I recommend recovering this index from a recent snapshot.
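
A minimal sketch of what restoring a single index from a snapshot might involve, assuming a snapshot repository and snapshot already exist. The repository name `my_backup`, the snapshot name `snapshot_1`, and the helper `restore_plan` are all hypothetical. The code only lists the requests in order; it does not send them.

```python
import json


def restore_plan(repository, snapshot, index):
    """Return the ordered requests for restoring one index from a snapshot."""
    restore_body = {
        "indices": index,               # restore only the damaged index
        "include_global_state": False,  # leave cluster-wide state untouched
    }
    return [
        # The existing red index has to be deleted (or closed) first,
        # because a restore cannot overwrite an open index.
        ("DELETE", f"/{index}", None),
        ("POST", f"/_snapshot/{repository}/{snapshot}/_restore",
         json.dumps(restore_body)),
    ]


for method, path, body in restore_plan("my_backup", "snapshot_1", "Index_Name"):
    print(method, path, body or "")
```

Anything indexed after the snapshot was taken would not be in the restored index and would need to be indexed again.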

This error suggests that there's something wrong with your storage - your OS confirmed to Elasticsearch that this data was durably written, and yet the data was lost afterwards. You'll need to dig into that to prevent it happening again.

Hi @DavidTurner,
We don't have any snapshots in our environment right now.
Is there a way to move forward in that case? Is there any way other than a snapshot to resolve this issue?

I think you'll have to delete the index and create it again from scratch.
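
As a rough sketch, deleting and recreating the index comes down to two requests, followed by reindexing every document from the original data source. The shard and replica counts here are hypothetical placeholders, as is the helper `recreate_plan`; `Index_Name` is the redacted name from the output above (real Elasticsearch index names must be lowercase). The code only builds the requests and does not send them.

```python
import json


def recreate_plan(index, shards, replicas):
    """Return the ordered requests for dropping and recreating an index."""
    settings = {
        "settings": {
            "number_of_shards": shards,      # assumed value, match the old index
            "number_of_replicas": replicas,  # assumed value
        }
    }
    return [
        ("DELETE", f"/{index}", None),               # discards all data in the index
        ("PUT", f"/{index}", json.dumps(settings)),  # creates an empty index
    ]


for method, path, body in recreate_plan("Index_Name", shards=3, replicas=1):
    print(method, path, body or "")
```

The delete step is irreversible, so this path only makes sense when the source data is still available to index again.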
