This is related to a previous issue of mine, where running out of memory left empty checkpoint files in the translog and corrupted an index: "Failed shard after OOMing, corrupt index".
I ran into the same issue the other day, but while deleting the empty checkpoints I accidentally deleted the original translog file for one index. Now I get the following error when trying to reallocate the shards.
Here's the output from the shard allocation explain API:
{"index":"cb_twitter2","shard":1,"primary":true,"current_state":"unassigned","unassigned_info":{"reason":"ALLOCATION_FAILED","at":"2019-01-28T06:28:03.978Z","failed_allocation_attempts":5,"details":"failed shard on node [CxXWE8BiQbS4ThB9AvvGQA]: failed recovery, failure RecoveryFailedException[[cb_twitter2][1]: Recovery failed on {node-1}{CxXWE8BiQbS4ThB9AvvGQA}{CuurlF8hQDCO5oAV4BwwUA}{10.142.0.2}{10.142.0.2:9300}]; nested: IndexShardRecoveryException[failed to recover from gateway]; nested: CorruptIndexException[misplaced codec footer (file truncated?): length=0 but footerLength==16 (resource=SimpleFSIndexInput(path=\"/var/lib/elasticsearch/nodes/0/indices/W_YMhmHHSCCB7a1wfU-gHA/1/translog/translog.ckp\"))]; ","last_allocation_status":"no"},"can_allocate":"no","allocate_explanation":"cannot allocate because allocation is not permitted to any of the nodes that hold an in-sync shard copy","node_allocation_decisions":[{"node_id":"CxXWE8BiQbS4ThB9AvvGQA","node_name":"node-1","transport_address":"10.142.0.2:9300","node_decision":"no","store":{"in_sync":true,"allocation_id":"yGfbZnSrTTGubsbcyPvvgw"},"deciders":[{"decider":"max_retry","decision":"NO","explanation":"shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2019-01-28T06:28:03.978Z], failed_attempts[5], delayed=false, details[failed shard on node [CxXWE8BiQbS4ThB9AvvGQA]: failed recovery, failure RecoveryFailedException[[cb_twitter2][1]: Recovery failed on {node-1}{CxXWE8BiQbS4ThB9AvvGQA}{CuurlF8hQDCO5oAV4BwwUA}{10.142.0.2}{10.142.0.2:9300}]; nested: IndexShardRecoveryException[failed to recover from gateway]; nested: CorruptIndexException[misplaced codec footer (file truncated?): length=0 but footerLength==16 (resource=SimpleFSIndexInput(path=\"/var/lib/elasticsearch/nodes/0/indices/W_YMhmHHSCCB7a1wfU-gHA/1/translog/translog.ckp\"))]; ], allocation_status[deciders_no]]]"}]}]}
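Since the raw response is hard to read on one line, here is a minimal sketch of pulling out the relevant fields with Python's standard json module. The function name summarize_explain is hypothetical, and the sample input is a trimmed version of the response above, not the full output.

```python
import json


def summarize_explain(raw):
    """Return the key fields of an allocation explain response as a dict."""
    # strict=False tolerates literal newlines inside JSON strings,
    # which can appear when the response is copy-pasted.
    doc = json.loads(raw, strict=False)
    info = doc.get("unassigned_info", {})
    # Flatten every (node, decider, decision) triple across all nodes.
    deciders = [
        (node.get("node_name"), d.get("decider"), d.get("decision"))
        for node in doc.get("node_allocation_decisions", [])
        for d in node.get("deciders", [])
    ]
    return {
        "index": doc.get("index"),
        "shard": doc.get("shard"),
        "state": doc.get("current_state"),
        "reason": info.get("reason"),
        "failed_attempts": info.get("failed_allocation_attempts"),
        "deciders": deciders,
    }


# Trimmed version of the response above, for illustration only.
sample = (
    '{"index":"cb_twitter2","shard":1,"primary":true,'
    '"current_state":"unassigned",'
    '"unassigned_info":{"reason":"ALLOCATION_FAILED","failed_allocation_attempts":5},'
    '"node_allocation_decisions":[{"node_name":"node-1",'
    '"deciders":[{"decider":"max_retry","decision":"NO"}]}]}'
)
print(summarize_explain(sample))
```

In this case the only decider saying NO is max_retry, so the shard is blocked by the repeated recovery failures on translog.ckp rather than by any allocation setting.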
Is there anything I can do to recover the index? I tried creating an empty translog.ckp file, which is why it now throws this error instead of a "missing file" error.
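In case it helps with diagnosing this, a minimal read-only sketch for listing zero-length checkpoint files, which is the condition the CorruptIndexException reports (length=0). The function name find_empty_checkpoints is hypothetical, and the data path is taken from the error message above, so it may differ on other nodes. It only lists files and does not delete or modify anything.

```python
import os


def find_empty_checkpoints(data_path):
    """Return paths of zero-length .ckp files inside any 'translog' directory."""
    empty = []
    for root, _dirs, files in os.walk(data_path):
        # Only look inside directories named "translog".
        if os.path.basename(root) != "translog":
            continue
        for name in files:
            path = os.path.join(root, name)
            if name.endswith(".ckp") and os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)


if __name__ == "__main__":
    # Data path taken from the error message above; adjust for your node.
    for path in find_empty_checkpoints("/var/lib/elasticsearch/nodes/0/indices"):
        print(path)
```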