Elasticsearch not working after the disk filled up

Hi All,

I am new to Elasticsearch, so please bear with me if these questions have obvious answers.
I am running Elasticsearch 1.4.2 on a cloud VM with Linux Server release 5.9 (Tikanga). Everything was fine until the disk filled up; since then I have been getting shard-related errors.

This is a master node.
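
For reference, here is a small sketch of the checks I have been running against the node (assuming Python 3 with the requests package installed; the host and port are taken from the Marvel URLs in the log below, so adjust them if your setup differs):

import requests

ES_HOST = "http://10.49.216.121:9200"  # assumed: same host/port as in the Marvel URLs below

# Overall cluster health; "red" means at least one primary shard is unassigned.
print(requests.get(ES_HOST + "/_cluster/health?pretty").text)

# Per-shard view; UNASSIGNED primaries here line up with the UnavailableShardsException below.
print(requests.get(ES_HOST + "/_cat/shards?v").text)

# Disk usage per node, to confirm the disk is no longer full.
print(requests.get(ES_HOST + "/_cat/allocation?v").text)

The health check reports the cluster as red, and the shard listing shows unassigned shards.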

ES error log snippet

[2015-04-07 00:00:22,417][DEBUG][action.bulk ] [Node_f0cd] observer:
timeout notification from cluster service. timeout setting [1m], time since
start [1m]
[2015-04-07 00:00:22,417][ERROR][marvel.agent.exporter ] [Node_f0cd] error
sending data to [http://10.49.216.121:9200/.marvel-2015.04.07/_bulk]:
SocketTimeoutException[Read timed out]
[2015-04-07 00:01:32,489][ERROR][marvel.agent.exporter ] [Node_f0cd] error
sending data to [http://10.49.216.121:9200/.marvel-2015.04.07/_bulk]:
SocketTimeoutException[Read timed out]
[2015-04-07 00:01:32,491][DEBUG][action.bulk ] [Node_f0cd] observer:
timeout notification from cluster service. timeout setting [1m], time since
start [1m]
[2015-04-07 00:02:42,561][ERROR][marvel.agent.exporter ] [Node_f0cd] error
sending data to [http://10.49.216.121:9200/.marvel-2015.04.07/_bulk]:
SocketTimeoutException[Read timed out]
[2015-04-07 00:02:42,561][DEBUG][action.bulk ] [Node_f0cd] observer:
timeout notification from cluster service. timeout setting [1m], time since
start [1m]
[2015-04-07 00:03:52,632][DEBUG][action.bulk ] [Node_f0cd] observer:
timeout notification from cluster service. timeout setting [1m], time since
start [1m]

[2015-04-07 00:54:05,769][ERROR][marvel.agent.exporter ] [Node_f0cd] create
failure (index:[.marvel-2015.04.07] type: [node_stats]):
UnavailableShardsException[[.marvel-2015.04.07][0] Primary shard is not
active or isn't assigned is a known node. Timeout: [1m], request:
org.elasticsearch.action.bulk.BulkShardRequest@20ec107a]

[2015-04-07 01:15:07,070][ERROR][marvel.agent.exporter ] [Node_f0cd] error
sending data to [http://10.49.216.121:9200/.marvel-2015.04.07/_bulk]:
SocketTimeoutException[Read timed out]
[2015-04-07 01:15:07,071][DEBUG][action.bulk ] [Node_f0cd] observer:
timeout notification from cluster service. timeout setting [1m], time since
start [1m]
[2015-04-07 01:16:17,145][DEBUG][action.bulk ] [Node_f0cd] observer:
timeout notification from cluster service. timeout setting [1m], time since
start [1m]

[2015-04-07 02:40:15,177][DEBUG][action.search.type ] [Node_f0cd] All
shards failed for phase: [query_fetch]
[2015-04-07 02:40:15,439][DEBUG][action.search.type ] [Node_f0cd] All
shards failed for phase: [query_fetch]
[2015-04-07 02:40:15,485][DEBUG][action.search.type ] [Node_f0cd] All
shards failed for phase: [query_fetch]
[2015-04-07 02:40:15,527][DEBUG][action.search.type ] [Node_f0cd] All
shards failed for phase: [query_fetch]
[2015-04-07 02:40:15,574][DEBUG][action.search.type ] [Node_f0cd] All
shards failed for phase: [query_fetch]
[2015-04-07 02:43:52,567][ERROR][marvel.agent.exporter ] [Node_f0cd] error
sending data to [http://10.49.216.121:9200/.marvel-2015.04.07/_bulk]:
SocketTimeoutException[Read timed out]
[2015-04-07 02:43:52,569][DEBUG][action.bulk ] [Node_f0cd] observer:
timeout notification from cluster service. timeout setting [1m], time since
start [1m]

[2015-04-07 03:33:20,288][WARN ][netty.channel.DefaultChannelPipeline] An
exception was thrown by an exception handler.
java.util.concurrent.RejectedExecutionException: Worker has already been
shutdown
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.registerTask(AbstractNioSelector.java:120)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:72)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:56)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioChannelSink.execute(AbstractNioChannelSink.java:34)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.execute(DefaultChannelPipeline.java:636)
at org.elasticsearch.common.netty.channel.Channels.fireExceptionCaughtLater(Channels.java:496)
at org.elasticsearch.common.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:46)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.notifyHandlerException(DefaultChannelPipeline.java:658)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:781)

I found a solution from Bigfoot in the article linked below.

The solution says: "I got it fixed by moving out of the way the /var/lib/elasticsearch/elasticsearch/nodes/0/indices/logstash-2015.02.16/1/translog/translog-1424037601837.recovering but I recon I now lost some events as this file was 40M?"

However, my directory structure is different (/var/lib/elasticsearch/elasticsearch/nodes/0/_state). Can anybody please tell me what I can do about it? There is also another solution given in the same link (probably as a last resort if nothing else works): http://stackoverflow.com/questions/21157466/all-shards-failed
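
In case it helps anyone point me in the right direction, this is roughly what I would try, modelled on Bigfoot's workaround. It is only a sketch: the indices path is the one from his post (which may not match my layout, since I only see a _state directory), and the backup directory name is something I made up.

import glob
import os
import shutil

# Assumed layout from Bigfoot's post; adjust if the node's data path differs.
INDICES_DIR = "/var/lib/elasticsearch/elasticsearch/nodes/0/indices"
BACKUP_DIR = "/var/tmp/es-recovering-translogs"  # hypothetical place to park the files

if not os.path.isdir(BACKUP_DIR):
    os.makedirs(BACKUP_DIR)

# Look for leftover <index>/<shard>/translog/*.recovering files and move them aside
# (with the node stopped), the same way Bigfoot moved his translog-*.recovering file.
pattern = os.path.join(INDICES_DIR, "*", "*", "translog", "*.recovering")
for path in glob.glob(pattern):
    print("moving", path)
    shutil.move(path, BACKUP_DIR)

Would that be a reasonable thing to try given my directory layout, or would I risk losing data the way Bigfoot did?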

Thanks for looking into it.
Regards,
Abhishek
