On one of the nodes, I am seeing the exception below. The overall cluster status is still "green", but this specific node has a slightly slower disk attached to it (a disk mounted over the network).
[2011-07-26 02:34:07,661][WARN ][index.translog ] [Clive] [mbgl][1] failed to flush shard on translog threshold
org.elasticsearch.index.engine.FlushFailedEngineException: [mbgl][1] Flush failed
	at org.elasticsearch.index.engine.robin.RobinEngine.flush(RobinEngine.java:702)
	at org.elasticsearch.index.shard.service.InternalIndexShard.flush(InternalIndexShard.java:417)
	at org.elasticsearch.index.translog.TranslogService$TranslogBasedFlush$1.run(TranslogService.java:160)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NullPointerException
	at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3331)
	at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3296)
	at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:3159)
	at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3232)
	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3214)
	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3198)
	at org.elasticsearch.index.engine.robin.RobinEngine.flush(RobinEngine.java:699)
	... 5 more
Any ideas what the underlying cause of this is? How can we debug it further to nail down the problem?
Cheers, DKN