A couple of nodes ran out of space on the disks that store the ES shards. After
shutting down the nodes, clearing some space, and starting them back up, the
log keeps filling with:
[2012-05-24 07:40:38,952][WARN ][cluster.action.shard ] [Firearm]
received shard failed for [threads][4], node[5sT1SgOxR16I92QKymk9yw], [P],
s[INITIALIZING], reason [Failed to start shard, message
[IndexShardGatewayRecoveryException[[threads][4] failed recovery]; nested:
NumberFormatException[For input string: "3hr.1337807144597"]; ]]
The [4] (which I assume is the shard number) varies across all shards 0-5, and
the input string changes as well.
ES 0.19.4
Java 1.6.0_31
3 nodes - 1 index (6 shards - 1 replica)
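To see at a glance which shards are failing and which input strings they choke on, the WARN entries can be grouped with a small script. This is only a rough sketch: the regular expression is inferred from the single log entry quoted above, and `summarize_failures` and `SAMPLE` are names made up for illustration, not part of Elasticsearch.

```python
import re
from collections import defaultdict

# Pattern inferred from the one WARN entry quoted above (an assumption,
# not an official Elasticsearch log format definition).
SHARD_FAILED = re.compile(
    r"received shard failed for \[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\], "
    r"node\[(?P<node>[^\]]+)\]"
    # Do not run past the start of the next "received shard failed" entry.
    r"(?:(?!received shard failed).)*?"
    r'NumberFormatException\[For input string: "(?P<bad>[^"]+)"\]'
)


def summarize_failures(log_text):
    """Map (index, shard) -> set of input strings that failed to parse."""
    # Entries wrap across several lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    failures = defaultdict(set)
    for m in SHARD_FAILED.finditer(flat):
        key = (m.group("index"), int(m.group("shard")))
        failures[key].add(m.group("bad"))
    return dict(failures)


# The log entry from this post, used as sample input.
SAMPLE = """[2012-05-24 07:40:38,952][WARN ][cluster.action.shard ] [Firearm]
received shard failed for [threads][4], node[5sT1SgOxR16I92QKymk9yw], [P],
s[INITIALIZING], reason [Failed to start shard, message
[IndexShardGatewayRecoveryException[[threads][4] failed recovery]; nested:
NumberFormatException[For input string: "3hr.1337807144597"]; ]]"""

if __name__ == "__main__":
    for (index, shard), bad in sorted(summarize_failures(SAMPLE).items()):
        print(index, shard, sorted(bad))
```

Run against the full log file instead of `SAMPLE`, this should list each failing shard once along with every distinct input string reported for it.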
There should be another log message on the other node with more details.
Can you gist it?
On Thu, May 24, 2012 at 9:58 AM, John Watson john@disqus.com wrote:
[original post, quoted above]
I was able to recover my cluster by moving the _state/state- file to
the replica shards, since the cluster seems to always wait for the primary
shards to attempt recovery.
On Friday, May 25, 2012 3:32:08 PM UTC-7, kimchy wrote:
[reply and original post, quoted above]