Translog files corrupted, cluster failing to recover

tarantin · August 19, 2016, 2:11pm

After a system crash (disk issues) on a 2-node cluster (shard replication set to 0), some of the translog tlog files were corrupted or missing in many of my indices. Is there a way to recover those indices even with some tlog files missing?

In the logs, at the moment, I'm getting the following message when restarting the nodes:

[2016-08-19 14:06:04,511][WARN ][indices.cluster ] [Hyperion] [[myindex][3]] marking and sending shard failed due to [failed recovery]
[myindex][[myindex][3]] IndexShardRecoveryException[failed recovery]; nested: IllegalStateException[translog file doesn't exist with generation: 2 lastCommitted: -1 checkpoint: 8 - translog ids must be consecutive];
at org.elasticsearch.index.shard.StoreRecoveryService$1.run(StoreRecoveryService.java:179)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: translog file doesn't exist with generation: 2 lastCommitted: -1 checkpoint: 8 - translog ids must be consecutive
at org.elasticsearch.index.translog.Translog.recoverFromFiles(Translog.java:328)
at org.elasticsearch.index.translog.Translog.<init>(Translog.java:179)
at org.elasticsearch.index.engine.InternalEngine.openTranslog(InternalEngine.java:208)
at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:151)
at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25)
at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:1509)
at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:1493)
at org.elasticsearch.index.shard.IndexShard.internalPerformTranslogRecovery(IndexShard.java:966)
at org.elasticsearch.index.shard.IndexShard.performTranslogRecovery(IndexShard.java:938)
at org.elasticsearch.index.shard.StoreRecoveryService.recoverFromStore(StoreRecoveryService.java:241)
at org.elasticsearch.index.shard.StoreRecoveryService.access$100(StoreRecoveryService.java:56)
at org.elasticsearch.index.shard.StoreRecoveryService$1.run(StoreRecoveryService.java:129)
... 3 more
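
The exception says recovery expected a translog file for generation 2 and did not find one. To see which generation files are actually present in a shard's translog directory, something like the following could be used. It is only a minimal sketch, assuming the files are named translog-<generation>.tlog as in Elasticsearch 2.x. The function names and the generation range passed in are illustrative and not part of Elasticsearch. It only reports gaps and does not repair anything.

```python
import re
from pathlib import Path

# Assumption: translog generation files are named "translog-<N>.tlog".
TLOG_NAME = re.compile(r"^translog-(\d+)\.tlog$")


def present_generations(translog_dir):
    """Return the set of generation numbers that have a .tlog file in translog_dir."""
    generations = set()
    for entry in Path(translog_dir).iterdir():
        match = TLOG_NAME.match(entry.name)
        if match:
            generations.add(int(match.group(1)))
    return generations


def missing_generations(present, first, last):
    """Return the generations in [first, last] that are not in `present`, in order."""
    return [gen for gen in range(first, last + 1) if gen not in present]
```

For example, `missing_generations(present_generations(path), 2, 8)` lists every generation between 2 and 8 that has no file under `path`, where `path` is the shard's translog directory (a placeholder here) and the range is taken by hand from the error message.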

The problem occurred on Elasticsearch 2.1. I tried upgrading to 2.3.5, but that didn't solve it.

warkolm · August 20, 2016, 7:26am

Bad translogs are not really a version-specific issue. You can either reprocess the original data or accept that this data is lost.
