NoSuchFileException: translog-5.ckp


(Alec) #1

Hello,

I am using Elasticsearch 2.2.0 and ran into a problem similar to "Cannot recover index because of missing tanslog files", but with a missing .ckp file instead of a .tlog file. The most likely cause is that the node ran out of disk space. I have not deleted any .tlog or .ckp files. The exception and the contents of the relevant folder are shown below. This is a test machine, so I am less concerned about saving the data than about knowing what to do if this happens in a PROD environment. We watch disk space in PROD, but nevertheless :)

ls -lah /opt/my_cluster/nodes/0/indices/logstash-syslog-myapp-2016.04.24/3/translog/
total 592K
drwxr-xr-x 2 elasticsearch elasticsearch 4.0K Apr 29 00:54 .
drwxr-xr-x 5 elasticsearch elasticsearch 4.0K Apr 24 03:06 ..
-rw-r--r-- 1 elasticsearch elasticsearch 20 Apr 24 13:35 translog-4.ckp
-rw-r--r-- 1 elasticsearch elasticsearch 316K Apr 24 17:22 translog-5.tlog
-rw-r--r-- 1 elasticsearch elasticsearch 260K Apr 24 17:22 translog-6.tlog
-rw-r--r-- 1 elasticsearch elasticsearch 20 Apr 24 17:22 translog.ckp

[logstash-syslog-myapp-2016.04.24][[logstash-syslog-myapp-2016.04.24][3]] IndexShardRecoveryException[failed to recovery from gateway]; nested: EngineCreationFailureException[failed to create engine]; nested: NoSuchFileException[/opt/my_cluster/nodes/0/indices/logstash-syslog-myapp-2016.04.24/3/translog/translog-5.ckp];
at org.elasticsearch.index.shard.StoreRecoveryService.recoverFromStore(StoreRecoveryService.java:254)
at org.elasticsearch.index.shard.StoreRecoveryService.access$100(StoreRecoveryService.java:56)
at org.elasticsearch.index.shard.StoreRecoveryService$1.run(StoreRecoveryService.java:129)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: [logstash-syslog-myapp-2016.04.24][[logstash-syslog-myapp-2016.04.24][3]] EngineCreationFailureException[failed to create engine]; nested: NoSuchFileException[/opt/my_cluster/nodes/0/indices/logstash-syslog-myapp-2016.04.24/3/translog/translog-5.ckp];
at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:156)
at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25)
at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:1450)
at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:1434)
at org.elasticsearch.index.shard.IndexShard.internalPerformTranslogRecovery(IndexShard.java:925)
at org.elasticsearch.index.shard.IndexShard.performTranslogRecovery(IndexShard.java:897)
at org.elasticsearch.index.shard.StoreRecoveryService.recoverFromStore(StoreRecoveryService.java:245)
... 5 more
Caused by: java.nio.file.NoSuchFileException: /opt/my_cluster/nodes/0/indices/logstash-syslog-myapp-2016.04.24/3/translog/translog-5.ckp
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
at java.nio.file.Files.newByteChannel(Files.java:361)
at java.nio.file.Files.newByteChannel(Files.java:407)
at java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384)
at java.nio.file.Files.newInputStream(Files.java:152)
at org.elasticsearch.index.translog.Checkpoint.read(Checkpoint.java:82)
at org.elasticsearch.index.translog.Translog.recoverFromFiles(Translog.java:330)
at org.elasticsearch.index.translog.Translog.<init>(Translog.java:179)
at org.elasticsearch.index.engine.InternalEngine.openTranslog(InternalEngine.java:209)
at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:152)
... 11 more
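
For anyone wanting to spot this condition before a node restart, here is a minimal diagnostic sketch in Python. It is not an Elasticsearch tool and it does not repair anything. It assumes the naming pattern visible in the listing above: each older translog-N.tlog is expected to have a matching translog-N.ckp, while the highest-numbered .tlog is treated as the current generation covered by translog.ckp. That rule is inferred from the listing and the stack trace, and the function names missing_checkpoints and check_dir are made up for illustration.

```python
import re
from pathlib import Path

# File name patterns as they appear in the directory listing above.
TLOG = re.compile(r"^translog-(\d+)\.tlog$")
CKP = re.compile(r"^translog-(\d+)\.ckp$")


def missing_checkpoints(names):
    """Return generations whose .tlog file has no numbered .ckp file.

    Assumption (not confirmed by Elasticsearch documentation): the highest
    .tlog generation is the current one and is covered by translog.ckp,
    so it is not expected to have its own translog-N.ckp.
    """
    names = list(names)  # allow generators; we scan the names twice
    tlogs = set()
    ckps = set()
    for name in names:
        m = TLOG.match(name)
        if m:
            tlogs.add(int(m.group(1)))
        m = CKP.match(name)
        if m:
            ckps.add(int(m.group(1)))
    if not tlogs:
        return []
    current = max(tlogs)
    return sorted(g for g in tlogs if g != current and g not in ckps)


def check_dir(path):
    """Run the same check against a real translog directory on disk."""
    return missing_checkpoints(p.name for p in Path(path).iterdir())


if __name__ == "__main__":
    # File names copied from the listing in this post.
    listing = [
        "translog-4.ckp",
        "translog-5.tlog",
        "translog-6.tlog",
        "translog.ckp",
    ]
    print(missing_checkpoints(listing))  # [5]
```

For the folder shown in this post it reports generation 5, which matches the translog-5.ckp named in the NoSuchFileException. To run it against a live shard, call check_dir with the shard's translog path.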


(Igor Masternoy) #2

A few months have passed. Do you have any updates? I am stuck with the same issue, with one difference: my disk is not full, but it is unstable and sometimes disconnects.

