Hi all,
I would be grateful for any help.
I am using ES version 6.4 with a 2-node cluster (I know that's bad, and we have plans to change it). One node is in Washington and the other is in Virginia (we have a dedicated link). Everything was working fine until one day I rebooted the ES host in Washington. Since the reboot, ES fails to start and I see continuous failures in the logs (please see below).
I see a similar discussion here
But how do I verify the explanation DavidTurner gave there? And how can I recover from this situation?
ES runs in a Docker container deployed on an ESXi server.
The host has 32 GB RAM, 8 virtual cores, and a 50 GB HDD with 12 GB free. CPU utilization is at 100%.
2020-05-31T06:24:46,134 [WARN][org.elasticsearch.index.engine.Engine.failEngine(Engine.java:1003)] [elasticsearch[RdCXtk2][generic][T#2]est_thread_info][RdCXtk2] [dfc-traffic_2020_05][3] failed engine [failed to recover from translog]
org.elasticsearch.index.engine.EngineException: failed to recover from translog
at org.elasticsearch.index.engine.InternalEngine.recoverFromTranslogInternal(InternalEngine.java:409) ~[elasticsearch-6.4.3.jar:6.4.3]
at org.elasticsearch.index.engine.InternalEngine.recoverFromTranslog(InternalEngine.java:380) ~[elasticsearch-6.4.3.jar:6.4.3]
at org.elasticsearch.index.engine.InternalEngine.recoverFromTranslog(InternalEngine.java:101) ~[elasticsearch-6.4.3.jar:6.4.3]
at org.elasticsearch.index.shard.IndexShard.openEngineAndRecoverFromTranslog(IndexShard.java:1333) ~[elasticsearch-6.4.3.jar:6.4.3]
at org.elasticsearch.index.shard.StoreRecovery.internalRecoverFromStore(StoreRecovery.java:421) ~[elasticsearch-6.4.3.jar:6.4.3]
at org.elasticsearch.index.shard.StoreRecovery.lambda$recoverFromStore$0(StoreRecovery.java:95) ~[elasticsearch-6.4.3.jar:6.4.3]
at org.elasticsearch.index.shard.StoreRecovery.executeRecovery(StoreRecovery.java:301) ~[elasticsearch-6.4.3.jar:6.4.3]
at org.elasticsearch.index.shard.StoreRecovery.recoverFromStore(StoreRecovery.java:93) ~[elasticsearch-6.4.3.jar:6.4.3]
at org.elasticsearch.index.shard.IndexShard.recoverFromStore(IndexShard.java:1603) ~[elasticsearch-6.4.3.jar:6.4.3]
at org.elasticsearch.index.shard.IndexShard.lambda$startRecovery$4(IndexShard.java:2055) ~[elasticsearch-6.4.3.jar:6.4.3]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [elasticsearch-6.4.3.jar:6.4.3]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
Caused by: org.elasticsearch.index.shard.IllegalIndexShardStateException: CurrentState[CLOSED] operation only allowed when recovering, origin [LOCAL_TRANSLOG_RECOVERY]
at org.elasticsearch.index.shard.IndexShard.ensureWriteAllowed(IndexShard.java:1480) ~[elasticsearch-6.4.3.jar:6.4.3]
at org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:699) ~[elasticsearch-6.4.3.jar:6.4.3]
at org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:1272) ~[elasticsearch-6.4.3.jar:6.4.3]
at org.elasticsearch.index.shard.IndexShard.runTranslogRecovery(IndexShard.java:1301) ~[elasticsearch-6.4.3.jar:6.4.3]
at org.elasticsearch.index.engine.InternalEngine.recoverFromTranslogInternal(InternalEngine.java:407) ~[elasticsearch-6.4.3.jar:6.4.3]
... 13 more
2020-05-31T06:24:46,138 [INFO][org.elasticsearch.node.Node.stop(Node.java:785)] [Thread-2est_thread_info][RdCXtk2] stopped
2020-05-31T06:24:46,139 [INFO][org.elasticsearch.node.Node.close(Node.java:803)] [Thread-2est_thread_info][RdCXtk2] closing ...
2020-05-31T06:24:46,136
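
One thing that can be checked from the full log is the order of events: whether the node was already shutting down when the engine failure was logged (the `CurrentState[CLOSED]` in the trace may point that way). Below is a minimal Python sketch, assuming the log format shown above. The function name `lifecycle_timeline` and the event labels are hypothetical, not part of Elasticsearch, and the sample lines are abbreviated copies of the entries above.

```python
import re
from datetime import datetime

# Timestamp and level at the start of each log entry, e.g.
# "2020-05-31T06:24:46,134 [WARN][...". Stack-trace lines do not match.
ENTRY = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3}) \[(\w+)\s*\]")


def lifecycle_timeline(lines):
    """Return (timestamp, kind) tuples for engine failures and
    node stop/close messages, sorted by timestamp."""
    events = []
    for line in lines:
        m = ENTRY.match(line)
        if not m:
            continue  # stack-trace line or truncated entry
        ts = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S,%f")
        if "failed engine" in line:
            kind = "engine_failure"
        elif "Node.stop" in line or "Node.close" in line:
            kind = "node_shutdown"
        else:
            continue
        events.append((ts, kind))
    return sorted(events)


# Abbreviated copies of the log entries above (thread and node names omitted).
sample = [
    "2020-05-31T06:24:46,134 [WARN][org.elasticsearch.index.engine.Engine.failEngine(Engine.java:1003)] failed engine [failed to recover from translog]",
    "at org.elasticsearch.index.engine.InternalEngine.recoverFromTranslogInternal(InternalEngine.java:409)",
    "2020-05-31T06:24:46,138 [INFO][org.elasticsearch.node.Node.stop(Node.java:785)] stopped",
    "2020-05-31T06:24:46,139 [INFO][org.elasticsearch.node.Node.close(Node.java:803)] closing ...",
    "2020-05-31T06:24:46,136",
]

for ts, kind in lifecycle_timeline(sample):
    print(ts.time(), kind)
```

Running this over the whole log file (for example `lifecycle_timeline(open(path))`, where `path` is wherever the container writes its log) would show whether any shutdown message precedes the first engine failure. The excerpt above only contains the final "stopped" and "closing" messages, so it cannot settle the question by itself.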