Before the cluster became red, I was only ingesting data into Elasticsearch, and something like this appeared in the logs:
[2015-11-11 00:00:00,279][WARN ][indices.recovery ] [Golem] [new_gompute_history_2015-10-23_10:10:26][1] recovery from [[Smasher][RPYw6RGeTDGxg1g9us422Q][bc10-05][inet[/10.8.5.15:9301]]] failed
org.elasticsearch.transport.RemoteTransportException: [Smasher][inet[/10.8.5.15:9301]][internal:index/shard/recovery/start_recovery]
Caused by: org.elasticsearch.index.engine.RecoveryEngineException: [new_gompute_history_2015-10-23_10:10:26][1] Phase[1] Execution failed
at org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:1151)
at org.elasticsearch.index.shard.service.InternalIndexShard.recover(InternalIndexShard.java:654)
at org.elasticsearch.indices.recovery.RecoverySource.recover(RecoverySource.java:137)
at org.elasticsearch.indices.recovery.RecoverySource.access$2600(RecoverySource.java:74)
at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:440)
at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:426)
at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.indices.recovery.RecoverFilesRecoveryException: [new_gompute_history_2015-10-23_10:10:26][1] Failed to transfer [0] files with total size of [0b]
at org.elasticsearch.indices.recovery.RecoverySource$1.phase1(RecoverySource.java:276)
at org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:1147)
... 9 more
Caused by: java.nio.file.NoSuchFileException: /tmp/elasticsearch/data/juan/nodes/1/indices/new_gompute_history_2015-10-23_10:10:26/1/index/_a_es090_0.pos
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
at java.nio.channels.FileChannel.open(FileChannel.java:287)
at java.nio.channels.FileChannel.open(FileChannel.java:334)
at org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:81)
at org.apache.lucene.store.FileSwitchDirectory.openInput(FileSwitchDirectory.java:172)
at org.apache.lucene.store.FilterDirectory.openInput(FilterDirectory.java:80)
at org.elasticsearch.index.store.DistributorDirectory.openInput(DistributorDirectory.java:130)
at org.elasticsearch.index.store.Store$MetadataSnapshot.checksumFromLuceneFile(Store.java:708)
at org.elasticsearch.index.store.Store$MetadataSnapshot.buildMetadata(Store.java:613)
at org.elasticsearch.index.store.Store$MetadataSnapshot.<init>(Store.java:596)
at org.elasticsearch.index.store.Store.getMetadata(Store.java:186)
at org.elasticsearch.indices.recovery.RecoverySource$1.phase1(RecoverySource.java:146)
... 10 more
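For context, the ingestion was nothing special, just indexing documents into that index, roughly along these lines (a sketch only; the bulk API usage, type name and fields are placeholders, not the actual payload):

curl -XPOST 'localhost:9200/new_gompute_history_2015-10-23_10:10:26/_bulk' --data-binary '
{ "index": { "_type": "doc", "_id": "1" } }
{ "some_field": "some value" }
'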
I didn't restart the cluster, and the problematic shards look like this:
# figure out what shard is the problem
curl localhost:9200/_cat/shards
index shard prirep state docs store ip node
new_gompute_history_2015-10-23_10:10:26 2 p INITIALIZING 10.8.5.15 Alyosha Kravinoff
new_gompute_history_2015-10-23_10:10:26 2 r UNASSIGNED
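For reference, more detail on the shard stuck in INITIALIZING can be pulled from the standard recovery and health endpoints (commands only; I have not pasted their output here):

curl 'localhost:9200/_cat/recovery/new_gompute_history_2015-10-23_10:10:26?v'
curl 'localhost:9200/_cluster/health?level=shards&pretty'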