Accidentally deleted translog for index

jtalmi · February 2, 2019, 7:37pm

This is related to a previous issue of mine, in which out-of-memory errors left an index corrupt because of empty checkpoint files in the translog: "Failed shard after OOMing, corrupt index".

I ran into the same issue the other day, but while deleting the empty checkpoints I accidentally deleted the original translog file for one index. Now I get the following error when trying to reallocate shards.

Here's the output from the cluster allocation explain API:

{"index":"cb_twitter2","shard":1,"primary":true,"current_state":"unassigned","unassigned_info":{"reason":"ALLOCATION_FAILED","at":"2019-01-28T06:28:03.978Z","failed_allocation_attempts":5,"details":"failed shard on node [CxXWE8BiQbS4ThB9AvvGQA]: failed recovery, failure RecoveryFailedException[[cb_twitter2][1]: Recovery failed on {node-1}{CxXWE8BiQbS4ThB9AvvGQA}{CuurlF8hQDCO5oAV4BwwUA}{10.142.0.2}{10.142.0.2:9300}]; nested: IndexShardRecoveryException[failed to recover from gateway]; nested: CorruptIndexException[misplaced codec footer (file truncated?): length=0 but footerLength==16 (resource=SimpleFSIndexInput(path=\"/var/lib/elasticsearch/nodes/0/indices/W_YMhmHHSCCB7a1wfU-gHA/1/translog/translog.ckp\"))]; ","last_allocation_status":"no"},"can_allocate":"no","allocate_explanation":"cannot allocate because allocation is not permitted to any of the nodes that hold an in-sync shard copy","node_allocation_decisions":[{"node_id":"CxXWE8BiQbS4ThB9AvvGQA","node_name":"node-1","transport_address":"10.142.0.2:9300","node_decision":"no","store":{"in_sync":true,"allocation_id":"yGfbZnSrTTGubsbcyPvvgw"},"deciders":[{"decider":"max_retry","decision":"NO","explanation":"shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2019-01-28T06:28:03.978Z], failed_attempts[5], delayed=false, details[failed shard on node [CxXWE8BiQbS4ThB9AvvGQA]: failed recovery, failure RecoveryFailedException[[cb_twitter2][1]: Recovery failed on {node-1}{CxXWE8BiQbS4ThB9AvvGQA}{CuurlF8hQDCO5oAV4BwwUA}{10.142.0.2}{10.142.0.2:9300}]; nested: IndexShardRecoveryException[failed to recover from gateway]; nested: CorruptIndexException[misplaced codec footer (file truncated?): length=0 but footerLength==16 (resource=SimpleFSIndexInput(path=\"/var/lib/elasticsearch/nodes/0/indices/W_YMhmHHSCCB7a1wfU-gHA/1/translog/translog.ckp\"))]; ], allocation_status[deciders_no]]]"}]}]}
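
The response above is hard to read as a single line. As a rough sketch (not part of the original post), the fields that matter can be pulled out with a few lines of Python. The dictionary below is an abbreviated copy of the response: the long `details` and `explanation` strings are shortened by hand, and the helper name `summarize_explain` is hypothetical.

```python
# Abbreviated copy of the allocation explain response shown above.
# The long "details" and "explanation" strings are shortened by hand.
explain = {
    "index": "cb_twitter2",
    "shard": 1,
    "primary": True,
    "current_state": "unassigned",
    "unassigned_info": {
        "reason": "ALLOCATION_FAILED",
        "failed_allocation_attempts": 5,
        "details": "CorruptIndexException[misplaced codec footer "
                   "(file truncated?): length=0 but footerLength==16]",
    },
    "can_allocate": "no",
    "node_allocation_decisions": [
        {
            "node_name": "node-1",
            "node_decision": "no",
            "store": {"in_sync": True},
            "deciders": [
                {
                    "decider": "max_retry",
                    "decision": "NO",
                    "explanation": "shard has exceeded the maximum "
                                   "number of retries [5]",
                }
            ],
        }
    ],
}


def summarize_explain(response):
    """Collect the shard identity, failure reason, and blocking deciders."""
    info = response.get("unassigned_info", {})
    blockers = []
    for node in response.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider.get("decision") == "NO":
                blockers.append((node.get("node_name"), decider.get("decider")))
    return {
        "shard": f'{response["index"]}[{response["shard"]}]',
        "primary": response["primary"],
        "reason": info.get("reason"),
        "failed_attempts": info.get("failed_allocation_attempts"),
        "blockers": blockers,
    }


summary = summarize_explain(explain)
print(summary)
```

Read this way, the output says two separate things: the primary of `cb_twitter2[1]` failed recovery because `translog.ckp` has length 0, and the `max_retry` decider is now refusing further attempts on `node-1` after five failures.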

Is there anything I can do to recover the index? I tried creating an empty translog.ckp file, which is why it now throws this error instead of a "missing file" error.

warkolm · February 3, 2019, 12:07am

Do you have a snapshot you can restore, or a replica you can promote?

Otherwise you may have lost the data in that shard.
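
For anyone checking these options, here is a minimal sketch of the relevant REST calls, using only the Python standard library. The cluster address `http://localhost:9200` and the helper names `build_request` and `send` are assumptions for illustration; the endpoints themselves are the cat shards API, the get snapshot repository API, and the reroute retry mentioned in the explain output above. The block only builds the requests; nothing is sent until `send` is called.

```python
import json
import urllib.request

# Assumption: adjust to the address of your own cluster.
ES_URL = "http://localhost:9200"


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request against the cluster."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(
        ES_URL + path, data=data, headers=headers, method=method
    )


def send(request):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# 1. List the copies of each shard of the index; "prirep" shows whether
#    a copy is a primary (p) or a replica (r).
replica_check = build_request(
    "GET", "/_cat/shards/cb_twitter2?v&h=index,shard,prirep,state,node"
)

# 2. List the registered snapshot repositories, if any.
repo_check = build_request("GET", "/_snapshot/_all")

# 3. Retry allocation of shards that hit the max_retry limit. This only
#    helps once the underlying cause of the failure has been fixed.
retry = build_request("POST", "/_cluster/reroute?retry_failed=true")

for request in (replica_check, repo_check, retry):
    print(request.get_method(), request.full_url)

# To actually run a check against a live cluster, for example:
#     print(send(replica_check))
```

If the first call shows no replica for shard 1 and the second returns no repositories, neither recovery path is available for that shard.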

