ES shard failed error

Hi,

I run ES with the default settings, which creates 5 shards per index, and I
kept indexing continuously overnight. This morning I found that the disk is
full and there are disk-full exceptions in elasticsearch.log:
[2013-07-23 09:50:47,776][WARN ][index.merge.scheduler ] [Madam Slay] [20130723][0] failed to merge
java.io.IOException: no space on device
    at java.io.RandomAccessFile.writeBytes(Native Method)
    at java.io.RandomAccessFile.write(RandomAccessFile.java:499)
    at org.apache.lucene.store.FSDirectory$FSIndexOutput.flushBuffer(FSDirectory.java:474)
    at org.apache.lucene.store.RateLimitedFSDirectory$RateLimitedIndexOutput.flushBuffer(RateLimitedFSDirectory.java:186)
    at org.apache.lucene.store.BufferedIndexOutput.writeBytes(BufferedIndexOutput.java:78)
    at org.elasticsearch.common.lucene.store.BufferedChecksumIndexOutput.flushBuffer(BufferedChecksumIndexOutput.java:65)
    at org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:113)
    at org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:102)

[2013-07-23 09:50:49,480][WARN ][index.shard.service ] [Madam Slay] [20130723][0] Failed to perform scheduled engine refresh
org.elasticsearch.index.engine.RefreshFailedEngineException: [20130723][0] Refresh failed
    at org.elasticsearch.index.engine.robin.RobinEngine.refresh(RobinEngine.java:796)
    at org.elasticsearch.index.shard.service.InternalIndexShard.refresh(InternalIndexShard.java:412)
    at org.elasticsearch.index.shard.service.InternalIndexShard$EngineRefresher$1.run(InternalIndexShard.java:755)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.IOException: no space on device

[2013-07-23 09:50:52,646][WARN ][cluster.action.shard ] [Madam Slay] sending failed shard for [20130723][0], node[TjhCZBlpQu-OwOm-8CzFHA], [P], s[STARTED], reason [engine failure, message [MergeException[java.io.IOException: no space on device]; nested: IOException[ no space on device]; ]]
[2013-07-23 09:50:52,646][WARN ][cluster.action.shard ] [Madam Slay] received shard failed for [20130723][0], node[TjhCZBlpQu-OwOm-8CzFHA], [P], s[STARTED], reason [engine failure, message [MergeException[java.io.IOException: no space on device]; nested: IOException[ no space on device]; ]]
[2013-07-23 10:02:02,902][WARN ][index.shard.service ] [Madam Slay] [20130723][2] Failed to perform scheduled engine refresh
org.elasticsearch.index.engine.RefreshFailedEngineException: [20130723][2] Refresh failed
    at org.elasticsearch.index.engine.robin.RobinEngine.refresh(RobinEngine.java:796)
    at org.elasticsearch.index.shard.service.InternalIndexShard.refresh(InternalIndexShard.java:412)
    at org.elasticsearch.index.shard.service.InternalIndexShard$EngineRefresher$1.run(InternalIndexShard.java:755)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.IOException: no space on device

Also, when I try to search in ES, it returns that shard 0 has failed, with
reason BroadcastShardOperationFailedException[[20130723][0] No active
shard(s)].
Is this caused by the full disk? Why do the other 4 shards still work fine?
How can I bring this shard back to work?
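
For reference, this is roughly how I am checking the per-shard state and the
free space on the data volume (just a sketch; localhost:9200 is the default
HTTP endpoint and /var/lib/elasticsearch is an assumed data path, adjust to
your setup):

import json
import shutil
import urllib.request

ES = "http://localhost:9200"            # default HTTP endpoint (assumption)
DATA_PATH = "/var/lib/elasticsearch"    # assumed data directory, adjust as needed

# Per-shard health for the daily index: shows which of the 5 shards are
# active and which are unassigned/failed.
with urllib.request.urlopen(ES + "/_cluster/health/20130723?level=shards") as resp:
    health = json.loads(resp.read().decode("utf-8"))
print(json.dumps(health, indent=2))

# Free space left on the volume that holds the index data.
usage = shutil.disk_usage(DATA_PATH)
print("free bytes on data volume:", usage.free)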

Hey,

there is an error message in your exception telling you 'no space on
device': if you run out of disk space, bad things can happen. For
Elasticsearch this means failing the shard and trying to place the data
somewhere else (if you run a single node, this is not going to work).

The other four shards most likely still work because you have not yet tried
to index data into them; otherwise they would fail as well.
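
Once you have freed some disk space (dropping an old index is usually the
quickest way), restarting the node should let it recover the failed shard
again. A minimal sketch of the steps using Python's standard library (plain
curl works just as well); localhost:9200 is the assumed endpoint and
"20130701" is a hypothetical old index name you can afford to delete:

import urllib.request

ES = "http://localhost:9200"  # assumed endpoint

# 1) Reclaim disk space by deleting an index you no longer need.
#    "20130701" is a hypothetical old daily index -- substitute your own.
req = urllib.request.Request(ES + "/20130701", method="DELETE")
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode("utf-8"))

# 2) Restart the Elasticsearch node (e.g. via your init/service script).

# 3) Wait for the cluster to report at least yellow before searching again.
with urllib.request.urlopen(ES + "/_cluster/health?wait_for_status=yellow") as resp:
    print(resp.read().decode("utf-8"))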

--Alex
