Hi,
My elasticsearch instance has crashed for the third time after the
system rebooted and the instance was therefore bounced. I would much
appreciate it if someone could help me out with this.
I have tried increasing the number of replicas and upgrading
elasticsearch to the newer version 0.19.8, but neither helped.
Every time elasticsearch crashed, the pattern was very similar (listed
below). The warning keeps repeating, which suggests that almost all
indices have this issue, and it produces an enormous log file. For your
information, I indexed 30 indices with a size of around 25GB into an
elasticsearch cluster with 2 nodes, 5 shards each, and 2 replicas each.
WARNING: [dev-bry200163108d] [coverage-elastic1346994255418][0] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [coverage-elastic1346994255418][0] failed recovery
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:228)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.elasticsearch.index.engine.EngineCreationFailureException: [coverage-elastic1346994255418][0] Failed to open reader on writer
    at org.elasticsearch.index.engine.robin.RobinEngine.start(RobinEngine.java:286)
    at org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryPrepareForTranslog(InternalIndexShard.java:579)
    at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:188)
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:177)
    ... 3 more
I do notice that if I disable translog flushing
(index.translog.disable_flush=true), the cluster is fine even when it is
killed by a system reboot. If I can guarantee that a flush is executed
after every index write/update operation, can I safely leave the
translog flush disabled permanently?
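To make the question concrete, here is a rough sketch of the workflow I have in mind: disable the automatic translog flush for an index, then flush explicitly after each batch of writes. It assumes a node reachable at http://localhost:9200 (a placeholder address), that the setting can be changed through the index `_settings` endpoint, and that the standard `_flush` endpoint is used for the explicit flush. The helper names are made up for illustration, and the requests are only built here; `send` is what would actually contact the cluster.

```python
import json
import urllib.request

# Assumption: placeholder address of one node in the cluster.
ES_URL = "http://localhost:9200"


def build_disable_flush_request(index, disabled=True, base_url=ES_URL):
    """Build a PUT to <index>/_settings toggling index.translog.disable_flush.

    Assumption: the setting is updatable at runtime via the settings API.
    """
    body = json.dumps({"index": {"translog.disable_flush": disabled}})
    return urllib.request.Request(
        f"{base_url}/{index}/_settings",
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def build_flush_request(index, base_url=ES_URL):
    """Build a POST to <index>/_flush, i.e. an explicit flush of that index."""
    return urllib.request.Request(
        f"{base_url}/{index}/_flush", data=b"", method="POST"
    )


def send(request, timeout=30):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

The intended usage would be `send(build_disable_flush_request("coverage-elastic1346994255418"))` once, and then `send(build_flush_request("coverage-elastic1346994255418"))` after every batch of index/update operations.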