Hello,
I have upgraded Elasticsearch from version 0.90.3 to version 1.3.4 (OS: Debian
Wheezy, installed from the deb package).
This is a cluster of 3 nodes using unicast discovery.
The upgrade was done with Elasticsearch shut down on all nodes.
After starting the new version, the cluster health is 'yellow' according to
the head plugin, but 'red' according to curl on _cluster/health.
3 indices in the cluster have unassigned shards (3 unassigned shards in total).
The logs on all nodes contain many messages about a "Corrupted index" and
"sending failed shard for".
Would upgrading to version 1.4.2 fix the problem? (Because of the Lucene
library fix LUCENE-5975.)
Deleting the indices and reindexing the data is a last resort.
ES state from first node:
curl -XGET 'http://127.0.0.1:9200/_cluster/health?pretty=true'
{
"cluster_name" : "searchcass",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 283,
"active_shards" : 576,
"relocating_shards" : 0,
"initializing_shards" : 3,
"unassigned_shards" : 3
}
How can I fix it? Please reply.
Regards
Grzesiek
ES log from node 1 (search01):
...
[2014-12-17 11:04:20,176][WARN ][cluster.action.shard ] [search01]
[201205][0] received shard failed for [201205][0],
node[OWUJ3lZbT5i00JKgrDFUcw], [P], s[INITIALIZING], indexUUID [na],
reason [master
[search01][HYtX23nPS7uU-DeY-zF6AA][search01][inet[/192.168.199.211:9300]]
marked shard as initializing, but shard is marked as failed, resend shard
failure]
[2014-12-17 11:04:20,253][WARN ][indices.cluster ] [search01]
[201301][0] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException:
[201301][0] failed to fetch index version after copying it over
at
org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:152)
at
org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: org.apache.lucene.index.CorruptIndexException: [201301][0]
Corrupted index [corrupted_cFQBoZ-WTK2sW8mgUUv1vw] caused by:
CorruptIndexException[did not read all bytes from file: read 9650 vs size
9651 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201301/0/index/_5f9v_k.del")))]
at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:353)
at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:338)
at
org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:119)
... 4 more
[2014-12-17 11:04:20,279][WARN ][cluster.action.shard ] [search01]
[201304][4] received shard failed for [201304][4],
node[zygoKW7SR6CwvanVoNrPcw], [P], s[INITIALIZING], indexUUID [na],
reason [Failed to start shard, message
[IndexShardGatewayRecoveryException[[201304][4] failed to fetch index
version after copying it over]; nested: CorruptIndexException[[201304][4]
Corrupted index [corrupted_7hrGiX_jTx2KLbQUIAiLpg] caused by:
CorruptIndexException[did not read all bytes from file: read 295641 vs size
295642 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201304/4/index/_294h_17.del")))]];
]]
[2014-12-17 11:04:20,305][WARN ][cluster.action.shard ] [search01]
[201304][4] received shard failed for [201304][4],
node[zygoKW7SR6CwvanVoNrPcw], [P], s[INITIALIZING], indexUUID [na],
reason [master
[search01][HYtX23nPS7uU-DeY-zF6AA][search01][inet[/192.168.199.211:9300]]
marked shard as initializing, but shard is marked as failed, resend shard
failure]
[2014-12-17 11:04:20,329][WARN ][cluster.action.shard ] [search01]
[201301][0] sending failed shard for [201301][0],
node[HYtX23nPS7uU-DeY-zF6AA], [P], s[INITIALIZING], indexUUID [na],
reason [Failed to start shard, message
[IndexShardGatewayRecoveryException[[201301][0] failed to fetch index
version after copying it over]; nested: CorruptIndexException[[201301][0]
Corrupted index [corrupted_cFQBoZ-WTK2sW8mgUUv1vw] caused by:
CorruptIndexException[did not read all bytes from file: read 9650 vs size
9651 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcassandra/nodes/0/indices/201301/0/index/_5f9v_k.del")))]];
]]
[2014-12-17 11:04:20,329][WARN ][cluster.action.shard ] [search01]
[201301][0] received shard failed for [201301][0],
node[HYtX23nPS7uU-DeY-zF6AA], [P], s[INITIALIZING], indexUUID [na],
reason [Failed to start shard, message
[IndexShardGatewayRecoveryException[[201301][0] failed to fetch index
version after copying it over]; nested: CorruptIndexException[[201301][0]
Corrupted index [corrupted_cFQBoZ-WTK2sW8mgUUv1vw] caused by:
CorruptIndexException[did not read all bytes from file: read 9650 vs size
9651 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201301/0/index/_5f9v_k.del")))]];
]]
[2014-12-17 11:04:20,331][WARN ][cluster.action.shard ] [search01]
[201301][0] received shard failed for [201301][0],
node[HYtX23nPS7uU-DeY-zF6AA], [P], s[INITIALIZING], indexUUID [na],
reason [master
[search01][HYtX23nPS7uU-DeY-zF6AA][search01][inet[/192.168.199.211:9300]]
marked shard as initializing, but shard is marked as failed, resend shard
failure]
...
ES log from node 2 (search02):
[2014-12-17 11:10:11,971][WARN ][cluster.action.shard ] [search02]
[201301][0] sending failed shard for [201301][0],
node[OWUJ3lZbT5i00JKgrDFUcw], [P], s[INITIALIZING], indexUUID [na],
reason [Failed to start shard, message
[IndexShardGatewayRecoveryException[[201301][0] failed to fetch index
version after copying it over]; nested: CorruptIndexException[[201301][0]
Corrupted index [corrupted_U1eBtw3YRYKcfuV9ZHPadw] caused by:
CorruptIndexException[did not read all bytes from file: read 9650 vs size
9651 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201301/0/index/_5f9v_k.del")))]];
]]
[2014-12-17 11:10:12,258][WARN ][indices.cluster ] [search02]
[201205][0] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException:
[201205][0] failed to fetch index version after copying it over
at
org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:152)
at
org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: org.apache.lucene.index.CorruptIndexException: [201205][0]
Corrupted index [corrupted_xCs6wOMpR-G3pbQfUpn-Ww] caused by:
CorruptIndexException[did not read all bytes from file: read 205 vs size
206 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201205/0/index/_1ys_3.del")))]
at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:353)
at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:338)
at
org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:119)
... 4 more
[2014-12-17 11:10:12,278][WARN ][indices.cluster ] [search02]
[201304][4] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException:
[201304][4] failed to fetch index version after copying it over
at
org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:152)
at
org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: org.apache.lucene.index.CorruptIndexException: [201304][4]
Corrupted index [corrupted_mfMa6wjdT1m6QZ6WUBHKrA] caused by:
CorruptIndexException[did not read all bytes from file: read 295641 vs size
295642 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201304/4/index/_294h_17.del")))]
at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:353)
at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:338)
at
org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:119)
... 4 more
[2014-12-17 11:10:12,282][WARN ][cluster.action.shard ] [search02]
[201205][0] sending failed shard for [201205][0],
node[OWUJ3lZbT5i00JKgrDFUcw], [P], s[INITIALIZING], indexUUID [na],
reason [Failed to start shard, message
[IndexShardGatewayRecoveryException[[201205][0] failed to fetch index
version after copying it over]; nested: CorruptIndexException[[201205][0]
Corrupted index [corrupted_xCs6wOMpR-G3pbQfUpn-Ww] caused by:
CorruptIndexException[did not read all bytes from file: read 205 vs size
206 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201205/0/index/_1ys_3.del")))]];
]]
[2014-12-17 11:10:12,297][WARN ][cluster.action.shard ] [search02]
[201304][4] sending failed shard for [201304][4],
node[OWUJ3lZbT5i00JKgrDFUcw], [P], s[INITIALIZING], indexUUID [na],
reason [Failed to start shard, message
[IndexShardGatewayRecoveryException[[201304][4] failed to fetch index
version after copying it over]; nested: CorruptIndexException[[201304][4]
Corrupted index [corrupted_mfMa6wjdT1m6QZ6WUBHKrA] caused by:
CorruptIndexException[did not read all bytes from file: read 295641 vs size
295642 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201304/4/index/_294h_17.del")))]];
]]
ES log from node 3 (search03):
[2014-12-17 11:13:49,541][WARN ][cluster.action.shard ] [search03]
[201205][0] sending failed shard for [201205][0],
node[zygoKW7SR6CwvanVoNrPcw], [P], s[INITIALIZING], indexUUID [na],
reason [Failed to start shard, message
[IndexShardGatewayRecoveryException[[201205][0] failed to fetch index
version after copying it over]; nested: CorruptIndexException[[201205][0]
Corrupted index [corrupted_weSqXhW_T9Wle8wEHhEnXw] caused by:
CorruptIndexException[did not read all bytes from file: read 205 vs size
206 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201205/0/index/_1ys_3.del")))]];
]]
[2014-12-17 11:13:49,581][WARN ][indices.cluster ] [search03]
[201304][4] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException:
[201304][4] failed to fetch index version after copying it over
at
org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:152)
at
org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: org.apache.lucene.index.CorruptIndexException: [201304][4]
Corrupted index [corrupted_7hrGiX_jTx2KLbQUIAiLpg] caused by:
CorruptIndexException[did not read all bytes from file: read 295641 vs size
295642 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201304/4/index/_294h_17.del")))]
at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:353)
at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:338)
at
org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:119)
... 4 more
[2014-12-17 11:13:49,651][WARN ][cluster.action.shard ] [search03]
[201304][4] sending failed shard for [201304][4],
node[zygoKW7SR6CwvanVoNrPcw], [P], s[INITIALIZING], indexUUID [na],
reason [Failed to start shard, message
[IndexShardGatewayRecoveryException[[201304][4] failed to fetch index
version after copying it over]; nested: CorruptIndexException[[201304][4]
Corrupted index [corrupted_7hrGiX_jTx2KLbQUIAiLpg] caused by:
CorruptIndexException[did not read all bytes from file: read 295641 vs size
295642 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201304/4/index/_294h_17.del")))]];
]]
[2014-12-17 11:13:49,747][WARN ][indices.cluster ] [search03]
[201205][0] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException:
[201205][0] failed to fetch index version after copying it over
at
org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:152)
at
org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: org.apache.lucene.index.CorruptIndexException: [201205][0]
Corrupted index [corrupted_weSqXhW_T9Wle8wEHhEnXw] caused by:
CorruptIndexException[did not read all bytes from file: read 205 vs size
206 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201205/0/index/_1ys_3.del")))]
at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:353)
at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:338)
at
org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:119)
... 4 more
[2014-12-17 11:13:49,823][WARN ][cluster.action.shard ] [search03]
[201205][0] sending failed shard for [201205][0],
node[zygoKW7SR6CwvanVoNrPcw], [P], s[INITIALIZING], indexUUID [na],
reason [Failed to start shard, message
[IndexShardGatewayRecoveryException[[201205][0] failed to fetch index
version after copying it over]; nested: CorruptIndexException[[201205][0]
Corrupted index [corrupted_weSqXhW_T9Wle8wEHhEnXw] caused by:
CorruptIndexException[did not read all bytes from file: read 205 vs size
206 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201205/0/index/_1ys_3.del")))]];
]]
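To summarize the logs above, here is a minimal Python sketch that pulls the affected file, the bytes read, and the expected size out of each "did not read all bytes from file" message. The names `CorruptFile`, `PATTERN` and `corrupt_files` are my own, and the regular expression only assumes the message layout seen in the excerpts above (including the line wrapping); it is not an Elasticsearch tool.

```python
import re
from collections import namedtuple

# Hypothetical helper type: one corrupted file as reported in the log.
CorruptFile = namedtuple("CorruptFile", "path read size")

# \s+ also matches newlines, so the pattern survives the wrapped log lines.
PATTERN = re.compile(
    r'did not read all bytes from file:\s+read\s+(\d+)\s+vs\s+size\s+(\d+)'
    r'\s+\(resource:\s+\S*?path="([^"]+)"'
)


def corrupt_files(log_text):
    """Return the unique corrupted files mentioned in the log, sorted by path."""
    seen = set()
    for read, size, path in PATTERN.findall(log_text):
        seen.add(CorruptFile(path, int(read), int(size)))
    return sorted(seen)


# Two messages copied from the excerpts above.
sample = """
Caused by: org.apache.lucene.index.CorruptIndexException: [201301][0]
Corrupted index [corrupted_cFQBoZ-WTK2sW8mgUUv1vw] caused by:
CorruptIndexException[did not read all bytes from file: read 9650 vs size
9651 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201301/0/index/_5f9v_k.del")))]
Caused by: org.apache.lucene.index.CorruptIndexException: [201205][0]
Corrupted index [corrupted_xCs6wOMpR-G3pbQfUpn-Ww] caused by:
CorruptIndexException[did not read all bytes from file: read 205 vs size
206 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201205/0/index/_1ys_3.del")))]
"""

for f in corrupt_files(sample):
    # Print the file and how many bytes were left unread.
    print(f.path, f.size - f.read)
```

In the excerpts above every reported file is a .del file, and each read stops exactly one byte short of the file size (9650 vs 9651, 205 vs 206, 295641 vs 295642).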