NoSuchFileException for the "translog.ckp" file

I wrote a post several days ago: https://discuss.elastic.co/t/help-all-shards-failed-regularly/211045. The problem still exists after I tried the methods mentioned in the replies. Can anyone help us solve this weird and exhausting problem?

I tried every way I could think of, but all of them failed:

  1. Increased the disk space. (The disk usage rate is now only around 10%.)
  2. Doubled the CPU count and memory. There seems to be sufficient memory on the server.
  3. Added another node. This is the method suggested by the Elastic folks in that post. It turned out that the two nodes broke down one after the other for the same reason.
  4. Upgraded the Java API version in our code to 5.6.1 and ran Elasticsearch 5.6.1 on the server.

Now we really do not know how to find the root cause of the problem, let alone solve it. It just occurs all of a sudden after running for some time. I really do not understand why the "translog.ckp" file keeps disappearing, again and again, on its own.

Sorry, I sound a bit pessimistic. I am really frustrated now...

If you are still seeing the same issue even after increasing the amount of storage available, I would suspect the type of storage the VM uses behind the scenes. I would recommend trying to deploy a VM with better storage to see if that makes any difference. You may also want to upgrade to a more recent version, as resiliency has improved over time.

Thanks Christian!

Does your first recommendation mean I should purchase a new server to run ES only?

As for your second recommendation, the latest ES version is 7.5.1, but the latest Java API version on Maven is only 6.0.0. It seems they cannot work together.

would suspect the type of storage used by the VM behind the scenes

I'm +1 on this. It's highly likely that the storage used here is broken in some way. What file system is this? Is it an NFS mount of some form, maybe?

Not sure what you mean here, but you can continue to use the transport client in 7.5.1 for the time being (though note that it's going away in 8); the dependency can be found on Maven here.
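
For reference, here is a minimal sketch of standing up a transport client against a local node, assuming the transport artifact mentioned above is on the classpath; the cluster name and address below are placeholders, not your actual settings:

    // Minimal sketch, assuming the transport client artifact from Maven
    // is on the classpath; cluster name and address are placeholders.
    import java.net.InetAddress;

    import org.elasticsearch.common.settings.Settings;
    import org.elasticsearch.common.transport.TransportAddress;
    import org.elasticsearch.transport.client.PreBuiltTransportClient;

    public class TransportClientSketch {
        public static void main(String[] args) throws Exception {
            Settings settings = Settings.builder()
                    .put("cluster.name", "my-cluster") // hypothetical cluster name
                    .build();
            try (PreBuiltTransportClient client = new PreBuiltTransportClient(settings)) {
                client.addTransportAddress(
                        new TransportAddress(InetAddress.getByName("127.0.0.1"), 9300));
                // From here the usual client.prepareIndex(...) /
                // client.prepareSearch(...) calls work as before.
            }
        }
    }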

The file system is "ext4". That is what the cloud server provider's support engineer told me.

As for the version problem, I made a mistake by saying that the latest API version on the Maven repository is only 6.0.0 (that is true only for the x-pack-transport dependency).

What do you suggest I do next: put ES on a different server? Upgrade to 7.5.1? Increase the storage space further? Dig into the details of the "tragic event" that caused the mysterious disappearance of the translog.ckp file? Or something else?

I am totally lost right now. The problem has continued for weeks. It just breaks down every afternoon.

increase the storage space further?

I don't think this will help. If you're not out of disk space, this isn't the issue.

upgrade the version to 7.5.1?

This seems like a good idea in general, since we've made stability improvements, but I'm not sure it will fix your specific issue. I don't know of a specific bug that would repeatedly cause exactly this failure, which makes it less likely that just moving to a newer version will fix all your troubles.

put the ES on a different server?

This seems like the most promising experiment you can do. Put ES on a different server with different storage and see if it behaves (maybe try a different cloud provider to make sure it's actually different infrastructure).

dig into the details about the "tragic event" that caused the mysterious missing of the translog.ckp file?

This is also something to look into, for sure. I'd investigate the system logs to see if there's anything suspicious going on with the file system (any errors or warnings). Are there any other errors in the ES log around the time you run into the missing file?

Unfortunately, @weiwei107, you elided all the interesting parts of the stack traces in your earlier message. Would you share the complete stack traces? The only useful line in what you shared is this one:

	at org.elasticsearch.index.translog.Checkpoint.write(Checkpoint.java:127) ~[elasticsearch-5.4.1.jar:5.4.1]

This tells us that the failure is happening while trying to write the checkpoint, so a NoSuchFileException indicates that the containing directory doesn't exist. Elasticsearch always creates this directory first so this very much suggests that your storage is behaving in a lazy or eventually-consistent fashion and not like a local ext4 filesystem.
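
You can reproduce that failure mode in isolation: opening a FileChannel for writing throws NoSuchFileException when the containing directory is missing, even with CREATE, which can create the file but not its parent directory. A minimal sketch (the path is hypothetical, and the exact open options in Checkpoint.write may differ):

    import java.io.IOException;
    import java.nio.channels.FileChannel;
    import java.nio.file.NoSuchFileException;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class MissingDirectoryDemo {
        public static void main(String[] args) throws IOException {
            // Hypothetical checkpoint path whose parent directory is gone.
            Path ckp = Paths.get("/tmp/no-such-dir/translog.ckp");
            try (FileChannel channel = FileChannel.open(ckp,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
                // Never reached: CREATE can create the file, but not a
                // missing parent directory.
            } catch (NoSuchFileException e) {
                System.out.println("NoSuchFileException: " + e.getFile());
            }
        }
    }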

The full stack traces are pasted below, from two different days and two different versions. Basically, if you look at the elasticsearch.log file each day, it always starts like this. This problem just appeared out of nowhere.

Another puzzling phenomenon is that no matter when I delete the data and restart ES and our website project, it always breaks down in the afternoon. I suspected the continuing consumption of disk space and memory by the running website project, but the "precision" of the breakdown timing, every afternoon regardless of the starting time, seemed to rule that assumption out.

[2019-12-22] Version: 5.6.1

    [2019-12-22T16:34:57,461][WARN ][o.e.i.e.Engine           ] [107room-node-1] [107room][3] failed engine [already closed by tragic event on the translog]
    java.nio.file.NoSuchFileException: /107room/elasticsearch-5.6.1/data/nodes/0/indices/JQMgbBerSD2YcPEI8BYWJQ/3/translog/translog.ckp
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ~[?:?]
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[?:?]
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[?:?]
	at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177) ~[?:?]
	at java.nio.channels.FileChannel.open(FileChannel.java:287) ~[?:1.8.0_161]
	at java.nio.channels.FileChannel.open(FileChannel.java:335) ~[?:1.8.0_161]
	at org.elasticsearch.index.translog.Checkpoint.write(Checkpoint.java:127) ~[elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.index.translog.TranslogWriter.writeCheckpoint(TranslogWriter.java:312) ~[elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.index.translog.TranslogWriter.syncUpTo(TranslogWriter.java:273) ~[elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.index.translog.Translog.ensureSynced(Translog.java:519) ~[elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.index.translog.Translog.ensureSynced(Translog.java:544) ~[elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.index.shard.IndexShard$1.write(IndexShard.java:1687) ~[elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.processList(AsyncIOProcessor.java:107) ~[elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.drainAndProcess(AsyncIOProcessor.java:99) ~[elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.put(AsyncIOProcessor.java:82) ~[elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.index.shard.IndexShard.sync(IndexShard.java:1709) ~[elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.action.support.replication.TransportWriteAction$AsyncAfterWriteAction.run(TransportWriteAction.java:307) ~[elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.action.support.replication.TransportWriteAction$WritePrimaryResult.<init>(TransportWriteAction.java:111) ~[elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:125) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:70) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:975) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:944) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.action.support.replication.ReplicationOperation.execute(ReplicationOperation.java:113) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:345) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:270) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:924) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:921) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.index.shard.IndexShardOperationsLock.acquire(IndexShardOperationsLock.java:151) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationLock(IndexShard.java:1659) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction.acquirePrimaryShardReference(TransportReplicationAction.java:933) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction.access$500(TransportReplicationAction.java:92) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.doRun(TransportReplicationAction.java:291) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:266) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:248) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:644) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) [elasticsearch-5.6.1.jar:5.6.1]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-5.6.1.jar:5.6.1]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
    [2019-12-22T16:34:57,481][WARN ][o.e.i.c.IndicesClusterStateService] [107room-node-1] [[107room][3]] marking and sending shard failed due to [shard failure, reason [already closed by tragic event on the translog]]
    java.nio.file.NoSuchFileException: /107room/elasticsearch-5.6.1/data/nodes/0/indices/JQMgbBerSD2YcPEI8BYWJQ/3/translog/translog.ckp
    	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ~[?:?]
    	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[?:?]
    	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[?:?]
    	at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177) ~[?:?]
    	at java.nio.channels.FileChannel.open(FileChannel.java:287) ~[?:1.8.0_161]
    	at java.nio.channels.FileChannel.open(FileChannel.java:335) ~[?:1.8.0_161]
    	at org.elasticsearch.index.translog.Checkpoint.write(Checkpoint.java:127) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.index.translog.TranslogWriter.writeCheckpoint(TranslogWriter.java:312) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.index.translog.TranslogWriter.syncUpTo(TranslogWriter.java:273) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.index.translog.Translog.ensureSynced(Translog.java:519) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.index.translog.Translog.ensureSynced(Translog.java:544) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.index.shard.IndexShard$1.write(IndexShard.java:1687) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.processList(AsyncIOProcessor.java:107) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.drainAndProcess(AsyncIOProcessor.java:99) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.put(AsyncIOProcessor.java:82) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.index.shard.IndexShard.sync(IndexShard.java:1709) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportWriteAction$AsyncAfterWriteAction.run(TransportWriteAction.java:307) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportWriteAction$WritePrimaryResult.<init>(TransportWriteAction.java:111) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:125) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:70) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:975) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:944) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.ReplicationOperation.execute(ReplicationOperation.java:113) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:345) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:270) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:924) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:921) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.index.shard.IndexShardOperationsLock.acquire(IndexShardOperationsLock.java:151) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationLock(IndexShard.java:1659) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction.acquirePrimaryShardReference(TransportReplicationAction.java:933) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction.access$500(TransportReplicationAction.java:92) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.doRun(TransportReplicationAction.java:291) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:266) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:248) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:644) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
    	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
    [2019-12-22T16:34:57,485][WARN ][o.e.c.a.s.ShardStateAction] [107room-node-1] [107room][3] received shard failed for shard id [[107room][3]], allocation id [-WOBlipwQ96HbY3SLr3CxQ], primary term [0], message [shard failure, reason [already closed by tragic event on the translog]], failure [NoSuchFileException[/107room/elasticsearch-5.6.1/data/nodes/0/indices/JQMgbBerSD2YcPEI8BYWJQ/3/translog/translog.ckp]]
    java.nio.file.NoSuchFileException: /107room/elasticsearch-5.6.1/data/nodes/0/indices/JQMgbBerSD2YcPEI8BYWJQ/3/translog/translog.ckp
    	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ~[?:?]
    	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[?:?]
    	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[?:?]
    	at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177) ~[?:?]
    	at java.nio.channels.FileChannel.open(FileChannel.java:287) ~[?:1.8.0_161]
    	at java.nio.channels.FileChannel.open(FileChannel.java:335) ~[?:1.8.0_161]
    	at org.elasticsearch.index.translog.Checkpoint.write(Checkpoint.java:127) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.index.translog.TranslogWriter.writeCheckpoint(TranslogWriter.java:312) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.index.translog.TranslogWriter.syncUpTo(TranslogWriter.java:273) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.index.translog.Translog.ensureSynced(Translog.java:519) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.index.translog.Translog.ensureSynced(Translog.java:544) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.index.shard.IndexShard$1.write(IndexShard.java:1687) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.processList(AsyncIOProcessor.java:107) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.drainAndProcess(AsyncIOProcessor.java:99) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.put(AsyncIOProcessor.java:82) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.index.shard.IndexShard.sync(IndexShard.java:1709) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportWriteAction$AsyncAfterWriteAction.run(TransportWriteAction.java:307) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportWriteAction$WritePrimaryResult.<init>(TransportWriteAction.java:111) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:125) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:70) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:975) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:944) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.ReplicationOperation.execute(ReplicationOperation.java:113) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:345) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:270) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:924) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:921) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.index.shard.IndexShardOperationsLock.acquire(IndexShardOperationsLock.java:151) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationLock(IndexShard.java:1659) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction.acquirePrimaryShardReference(TransportReplicationAction.java:933) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction.access$500(TransportReplicationAction.java:92) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.doRun(TransportReplicationAction.java:291) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:266) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:248) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:644) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.6.1.jar:5.6.1]
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
    	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
    [2019-12-22T16:34:57,504][INFO ][o.e.c.r.a.AllocationService] [107room-node-1] Cluster health status changed from [YELLOW] to [RED] (reason: [shards failed [[107room][3]] ...]).
        [2019-12-22T16:34:57,592][WARN ][o.e.i.c.IndicesClusterStateService] [107room-node-1] [[107room][3]] marking and sending shard failed due to [failed recovery]
        org.elasticsearch.indices.recovery.RecoveryFailedException: [107room][3]: Recovery failed on {107room-node-1}{V0P5TAFsSciUxb8oTqNtJg}{bLhy2tMdRrWBjYFqR_jTyw}{127.0.0.1}{127.0.0.1:9300}
        	at org.elasticsearch.index.shard.IndexShard.lambda$startRecovery$1(IndexShard.java:1488) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.6.1.jar:5.6.1]
        	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]
        	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
        	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
        Caused by: org.elasticsearch.index.shard.IndexShardRecoveryException: failed to recover from gateway
        	at org.elasticsearch.index.shard.StoreRecovery.internalRecoverFromStore(StoreRecovery.java:365) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.StoreRecovery.lambda$recoverFromStore$0(StoreRecovery.java:90) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.StoreRecovery.executeRecovery(StoreRecovery.java:257) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.StoreRecovery.recoverFromStore(StoreRecovery.java:88) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.IndexShard.recoverFromStore(IndexShard.java:1236) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.IndexShard.lambda$startRecovery$1(IndexShard.java:1484) ~[elasticsearch-5.6.1.jar:5.6.1]
        	... 4 more
        Caused by: org.elasticsearch.index.engine.EngineCreationFailureException: failed to create engine
        	at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:163) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:1602) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:1584) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.IndexShard.internalPerformTranslogRecovery(IndexShard.java:1027) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.IndexShard.performTranslogRecovery(IndexShard.java:987) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.StoreRecovery.internalRecoverFromStore(StoreRecovery.java:360) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.StoreRecovery.lambda$recoverFromStore$0(StoreRecovery.java:90) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.StoreRecovery.executeRecovery(StoreRecovery.java:257) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.StoreRecovery.recoverFromStore(StoreRecovery.java:88) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.IndexShard.recoverFromStore(IndexShard.java:1236) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.IndexShard.lambda$startRecovery$1(IndexShard.java:1484) ~[elasticsearch-5.6.1.jar:5.6.1]
        	... 4 more
        Caused by: java.nio.file.NoSuchFileException: /107room/elasticsearch-5.6.1/data/nodes/0/indices/JQMgbBerSD2YcPEI8BYWJQ/3/translog/translog.ckp
        	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ~[?:?]
        	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[?:?]
        	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[?:?]
        	at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214) ~[?:?]
        	at java.nio.file.Files.newByteChannel(Files.java:361) ~[?:1.8.0_161]
        	at java.nio.file.Files.newByteChannel(Files.java:407) ~[?:1.8.0_161]
        	at org.apache.lucene.store.SimpleFSDirectory.openInput(SimpleFSDirectory.java:77) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
        	at org.elasticsearch.index.translog.Checkpoint.read(Checkpoint.java:92) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.translog.Translog.readCheckpoint(Translog.java:1357) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.translog.Translog.<init>(Translog.java:161) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.engine.InternalEngine.openTranslog(InternalEngine.java:272) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:160) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:1602) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:1584) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.IndexShard.internalPerformTranslogRecovery(IndexShard.java:1027) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.IndexShard.performTranslogRecovery(IndexShard.java:987) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.StoreRecovery.internalRecoverFromStore(StoreRecovery.java:360) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.StoreRecovery.lambda$recoverFromStore$0(StoreRecovery.java:90) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.StoreRecovery.executeRecovery(StoreRecovery.java:257) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.StoreRecovery.recoverFromStore(StoreRecovery.java:88) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.IndexShard.recoverFromStore(IndexShard.java:1236) ~[elasticsearch-5.6.1.jar:5.6.1]
        	at org.elasticsearch.index.shard.IndexShard.lambda$startRecovery$1(IndexShard.java:1484) ~[elasticsearch-5.6.1.jar:5.6.1]
        	... 4 more

[2019-12-20] Version: 5.4.1

    [2019-12-20T12:44:06,870][INFO ][o.e.n.Node               ] [107room-node-1] stopping ...
    [2019-12-20T12:44:07,004][INFO ][o.e.n.Node               ] [107room-node-1] stopped
    [2019-12-20T12:44:07,004][INFO ][o.e.n.Node               ] [107room-node-1] closing ...
    [2019-12-20T12:44:07,022][INFO ][o.e.n.Node               ] [107room-node-1] closed
    [2019-12-20T12:48:13,583][INFO ][o.e.n.Node               ] [107room-node-1] initializing ...
    [2019-12-20T12:48:13,791][INFO ][o.e.e.NodeEnvironment    ] [107room-node-1] using [1] data paths, mounts [[/107room (/dev/xvdb1)]], net usable_space [8.1gb], net total_space [9.9gb], spins? [no], types [ext4]
    [2019-12-20T12:48:13,792][INFO ][o.e.e.NodeEnvironment    ] [107room-node-1] heap size [5.9gb], compressed ordinary object pointers [true]
    [2019-12-20T12:48:13,830][INFO ][o.e.n.Node               ] [107room-node-1] node name [107room-node-1], node ID [BsWjLj3USFivENqD7p6f9Q]
    [2019-12-20T12:48:13,830][INFO ][o.e.n.Node               ] [107room-node-1] version[5.4.1], pid[2061], build[2cfe0df/2017-05-29T16:05:51.443Z], OS[Linux/3.2.0-29-generic/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_161/25.161-b12]
    [2019-12-20T12:48:13,831][INFO ][o.e.n.Node               ] [107room-node-1] JVM arguments [-Xms6g, -Xmx6g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+DisableExplicitGC, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -Djdk.io.permissionsUseCanonicalPath=true, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Dlog4j.skipJansi=true, -XX:+HeapDumpOnOutOfMemoryError, -XX:+PrintGCDetails, -XX:+PrintGCTimeStamps, -XX:+PrintGCDateStamps, -XX:+PrintClassHistogram, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -Des.path.home=/107room/elasticsearch]
    [2019-12-20T12:48:15,631][INFO ][o.e.p.PluginsService     ] [107room-node-1] loaded module [aggs-matrix-stats]
    [2019-12-20T12:48:15,631][INFO ][o.e.p.PluginsService     ] [107room-node-1] loaded module [ingest-common]
    [2019-12-20T12:48:15,631][INFO ][o.e.p.PluginsService     ] [107room-node-1] loaded module [lang-expression]
    [2019-12-20T12:48:15,631][INFO ][o.e.p.PluginsService     ] [107room-node-1] loaded module [lang-groovy]
    [2019-12-20T12:48:15,632][INFO ][o.e.p.PluginsService     ] [107room-node-1] loaded module [lang-mustache]
    [2019-12-20T12:48:15,632][INFO ][o.e.p.PluginsService     ] [107room-node-1] loaded module [lang-painless]
    [2019-12-20T12:48:15,632][INFO ][o.e.p.PluginsService     ] [107room-node-1] loaded module [percolator]
    [2019-12-20T12:48:15,632][INFO ][o.e.p.PluginsService     ] [107room-node-1] loaded module [reindex]
    [2019-12-20T12:48:15,632][INFO ][o.e.p.PluginsService     ] [107room-node-1] loaded module [transport-netty3]
    [2019-12-20T12:48:15,632][INFO ][o.e.p.PluginsService     ] [107room-node-1] loaded module [transport-netty4]
    [2019-12-20T12:48:15,633][INFO ][o.e.p.PluginsService     ] [107room-node-1] no plugins loaded
    [2019-12-20T12:48:18,793][INFO ][o.e.d.DiscoveryModule    ] [107room-node-1] using discovery type [zen]
    [2019-12-20T12:48:19,621][INFO ][o.e.n.Node               ] [107room-node-1] initialized
    [2019-12-20T12:48:19,621][INFO ][o.e.n.Node               ] [107room-node-1] starting ...
    [2019-12-20T12:48:19,965][INFO ][o.e.t.TransportService   ] [107room-node-1] publish_address {127.0.0.1:9300}, bound_addresses {127.0.0.1:9300}
    [2019-12-20T12:48:19,977][WARN ][o.e.b.BootstrapChecks    ] [107room-node-1] max file descriptors [65535] for elasticsearch process is too low, increase to at least [65536]
    [2019-12-20T12:48:23,044][INFO ][o.e.c.s.ClusterService   ] [107room-node-1] new_master {107room-node-1}{BsWjLj3USFivENqD7p6f9Q}{nKtiwIezQmCYXm3oUyg_bw}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
    [2019-12-20T12:48:23,089][INFO ][o.e.h.n.Netty4HttpServerTransport] [107room-node-1] publish_address {127.0.0.1:9200}, bound_addresses {127.0.0.1:9200}
    [2019-12-20T12:48:23,093][INFO ][o.e.n.Node               ] [107room-node-1] started
    [2019-12-20T12:48:23,410][INFO ][o.e.g.GatewayService     ] [107room-node-1] recovered [1] indices into cluster_state
    [2019-12-20T12:48:25,576][INFO ][o.e.c.r.a.AllocationService] [107room-node-1] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[107room][0]] ...]).
    [2019-12-20T12:50:45,984][INFO ][o.e.c.s.ClusterService   ] [107room-node-1] added {{107room-node-2}{j20HdwqKQMyf5h001KzuoA}{yERPmtmaSrq_cHwCQwfRnA}{127.0.0.1}{127.0.0.1:9301},}, reason: zen-disco-node-join[{107room-node-2}{j20HdwqKQMyf5h001KzuoA}{yERPmtmaSrq_cHwCQwfRnA}{127.0.0.1}{127.0.0.1:9301}]
    [2019-12-20T12:51:17,881][INFO ][o.e.c.r.a.AllocationService] [107room-node-1] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[107room][4]] ...]).
    [2019-12-20T14:40:35,897][WARN ][o.e.i.e.Engine           ] [107room-node-1] [107room][0] failed engine [already closed by tragic event on the translog]
    java.nio.file.NoSuchFileException: /107room/elasticsearch/data/nodes/0/indices/kwyN5IkrTPiMsECA2jUjgA/0/translog/translog.ckp
    	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ~[?:?]
    	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[?:?]
    	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[?:?]
    	at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177) ~[?:?]
    	at java.nio.channels.FileChannel.open(FileChannel.java:287) ~[?:1.8.0_161]
    	at java.nio.channels.FileChannel.open(FileChannel.java:335) ~[?:1.8.0_161]
    	at org.elasticsearch.index.translog.Checkpoint.write(Checkpoint.java:127) ~[elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.index.translog.TranslogWriter.writeCheckpoint(TranslogWriter.java:312) ~[elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.index.translog.TranslogWriter.syncUpTo(TranslogWriter.java:273) ~[elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.index.translog.Translog.ensureSynced(Translog.java:520) ~[elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.index.translog.Translog.ensureSynced(Translog.java:545) ~[elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.index.shard.IndexShard$1.write(IndexShard.java:1684) ~[elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.processList(AsyncIOProcessor.java:107) ~[elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.drainAndProcess(AsyncIOProcessor.java:99) ~[elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.put(AsyncIOProcessor.java:82) ~[elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.index.shard.IndexShard.sync(IndexShard.java:1706) ~[elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.action.support.replication.TransportWriteAction$AsyncAfterWriteAction.run(TransportWriteAction.java:307) ~[elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.action.support.replication.TransportWriteAction$WritePrimaryResult.<init>(TransportWriteAction.java:111) ~[elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:122) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:69) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:939) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:908) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.action.support.replication.ReplicationOperation.execute(ReplicationOperation.java:113) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:322) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:264) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:888) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:885) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.index.shard.IndexShardOperationsLock.acquire(IndexShardOperationsLock.java:147) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationLock(IndexShard.java:1656) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction.acquirePrimaryShardReference(TransportReplicationAction.java:897) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction.access$400(TransportReplicationAction.java:93) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.doRun(TransportReplicationAction.java:281) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:260) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:252) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:627) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) [elasticsearch-5.4.1.jar:5.4.1]
    	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-5.4.1.jar:5.4.1]
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
    	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
    [2019-12-20T14:40:35,915][WARN ][o.e.i.c.IndicesClusterStateService] [107room-node-1] [[107room][0]] marking and sending shard failed due to [shard failure, reason [already closed by tragic event on the translog]]
    java.nio.file.NoSuchFileException: /107room/elasticsearch/data/nodes/0/indices/kwyN5IkrTPiMsECA2jUjgA/0/translog/translog.ckp
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ~[?:?]
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[?:?]
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[?:?]
	at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177) ~[?:?]
	at java.nio.channels.FileChannel.open(FileChannel.java:287) ~[?:1.8.0_161]
	at java.nio.channels.FileChannel.open(FileChannel.java:335) ~[?:1.8.0_161]
	at org.elasticsearch.index.translog.Checkpoint.write(Checkpoint.java:127) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.index.translog.TranslogWriter.writeCheckpoint(TranslogWriter.java:312) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.index.translog.TranslogWriter.syncUpTo(TranslogWriter.java:273) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.index.translog.Translog.ensureSynced(Translog.java:520) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.index.translog.Translog.ensureSynced(Translog.java:545) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.index.shard.IndexShard$1.write(IndexShard.java:1684) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.processList(AsyncIOProcessor.java:107) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.drainAndProcess(AsyncIOProcessor.java:99) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.put(AsyncIOProcessor.java:82) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.index.shard.IndexShard.sync(IndexShard.java:1706) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportWriteAction$AsyncAfterWriteAction.run(TransportWriteAction.java:307) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportWriteAction$WritePrimaryResult.<init>(TransportWriteAction.java:111) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:122) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:69) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:939) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:908) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.ReplicationOperation.execute(ReplicationOperation.java:113) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:322) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:264) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:888) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:885) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.index.shard.IndexShardOperationsLock.acquire(IndexShardOperationsLock.java:147) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationLock(IndexShard.java:1656) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction.acquirePrimaryShardReference(TransportReplicationAction.java:897) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction.access$400(TransportReplicationAction.java:93) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.doRun(TransportReplicationAction.java:281) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:260) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:252) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:627) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.4.1.jar:5.4.1]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
    [2019-12-20T14:40:35,920][WARN ][o.e.c.a.s.ShardStateAction] [107room-node-1] [107room][0] received shard failed for shard id [[107room][0]], allocation id [QCFRIBlQTYmXnsWHKFZ98w], primary term [0], message [shard failure, reason [already closed by tragic event on the translog]], failure [NoSuchFileException[/107room/elasticsearch/data/nodes/0/indices/kwyN5IkrTPiMsECA2jUjgA/0/translog/translog.ckp]]
    java.nio.file.NoSuchFileException: /107room/elasticsearch/data/nodes/0/indices/kwyN5IkrTPiMsECA2jUjgA/0/translog/translog.ckp
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ~[?:?]
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[?:?]
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[?:?]
	at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177) ~[?:?]
	at java.nio.channels.FileChannel.open(FileChannel.java:287) ~[?:1.8.0_161]
	at java.nio.channels.FileChannel.open(FileChannel.java:335) ~[?:1.8.0_161]
	at org.elasticsearch.index.translog.Checkpoint.write(Checkpoint.java:127) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.index.translog.TranslogWriter.writeCheckpoint(TranslogWriter.java:312) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.index.translog.TranslogWriter.syncUpTo(TranslogWriter.java:273) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.index.translog.Translog.ensureSynced(Translog.java:520) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.index.translog.Translog.ensureSynced(Translog.java:545) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.index.shard.IndexShard$1.write(IndexShard.java:1684) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.processList(AsyncIOProcessor.java:107) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.drainAndProcess(AsyncIOProcessor.java:99) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.put(AsyncIOProcessor.java:82) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.index.shard.IndexShard.sync(IndexShard.java:1706) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportWriteAction$AsyncAfterWriteAction.run(TransportWriteAction.java:307) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportWriteAction$WritePrimaryResult.<init>(TransportWriteAction.java:111) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:122) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:69) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:939) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:908) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.ReplicationOperation.execute(ReplicationOperation.java:113) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:322) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:264) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:888) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:885) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.index.shard.IndexShardOperationsLock.acquire(IndexShardOperationsLock.java:147) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationLock(IndexShard.java:1656) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction.acquirePrimaryShardReference(TransportReplicationAction.java:897) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction.access$400(TransportReplicationAction.java:93) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.doRun(TransportReplicationAction.java:281) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:260) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:252) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:627) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) ~[elasticsearch-5.4.1.jar:5.4.1]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.4.1.jar:5.4.1]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]

Sorry, I had to split the log into different parts due to the character limit. I hope it is still readable.

I have found the cause of this problem. After we found the disk space was insufficient, my colleague wrote a crontab job to delete outdated log files at 2pm daily. He made a mistake in that crontab line, and it wrongly matched the translog files as well, which also explains why the cluster broke down every afternoon.
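
For anyone who hits something similar: a cleanup glob broad enough to match ordinary log files also matches the translog files. I am not reproducing our exact crontab line here, so the pattern below is only a hypothetical illustration of the over-match:

    import java.nio.file.FileSystems;
    import java.nio.file.PathMatcher;
    import java.nio.file.Paths;

    public class GlobOverMatchDemo {
        public static void main(String[] args) {
            // Hypothetical cleanup pattern; our real crontab line differed,
            // but anything containing "log" shows the same over-match.
            PathMatcher cleanup = FileSystems.getDefault().getPathMatcher("glob:*log*");
            System.out.println(cleanup.matches(Paths.get("website-2019-12-20.log"))); // true, intended
            System.out.println(cleanup.matches(Paths.get("translog.ckp")));           // true, disastrous
            System.out.println(cleanup.matches(Paths.get("translog-42.tlog")));       // true, disastrous
        }
    }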

What a stupid mistake we made. Thanks for your time, and apologies for our carelessness!


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.