Hello,
Over the past couple of days I've been getting a lot of error messages about
corrupted replica shards. The primary shards come up quickly after an ES
process restart, but the replicas take a long time to come back. Sometimes it
takes a few node restarts to 'kick' the nodes into starting the replica shards.
ES version is 1.3.1, running on CentOS 6.5 hosted at SoftLayer. It's a
3-node cluster with 4 Logstash feeders hanging off it.
Here are the errors:
[2014-08-26 15:01:18,682][WARN ][cluster.action.shard ] [log03 / Salvador Dali] [downloader-2014.08][4] received shard failed for [downloader-2014.08][4], node[l9-BQTHSSF-ElhgpPBZ24w], [R], s[INITIALIZING], indexUUID [2vRrb5YlQP6MTVr1chOezg], reason [engine failure, message [corrupted preexisting index][CorruptIndexException[[downloader-2014.08][4] Corrupted index [corrupted_SkU0-ZHZRxivSnGczABb_g] caused by: CorruptIndexException[codec footer mismatch: actual footer=-1676705023 vs expected footer=-1071082520 (resource: NIOFSIndexInput(path="/acc/ES/NBS/nodes/0/indices/downloader-2014.08/4/index/_k9a_es090_0.doc"))]]]]
[2014-08-26 15:01:18,682][WARN ][cluster.action.shard ] [log03 / Salvador Dali] [eventlog-2014.06][0] received shard failed for [eventlog-2014.06][0], node[l9-BQTHSSF-ElhgpPBZ24w], [R], s[INITIALIZING], indexUUID [jbvChdRrRB6HTutxPvxMmQ], reason [engine failure, message [corrupted preexisting index][CorruptIndexException[[eventlog-2014.06][0] Corrupted index [corrupted__712QIBQQqafzpBoQwZtcg] caused by: CorruptIndexException[codec footer mismatch: actual footer=0 vs expected footer=-1071082520 (resource: NIOFSIndexInput(path="/acc/ES/NBS/nodes/0/indices/eventlog-2014.06/0/index/_1k4x.nvd"))]]]]
[2014-08-26 15:01:18,684][WARN ][cluster.action.shard ] [log03 / Salvador Dali] [eventlog-2014.07][0] received shard failed for [eventlog-2014.07][0], node[l9-BQTHSSF-ElhgpPBZ24w], [R], s[INITIALIZING], indexUUID [T4tTXkPjTaCdSVNTjHfOcg], reason [engine failure, message [corrupted preexisting index][CorruptIndexException[[eventlog-2014.07][0] Corrupted index [corrupted_OzfNRRGyTIq8a1PRhLYG2w] caused by: CorruptIndexException[codec footer mismatch: actual footer=0 vs expected footer=-1071082520 (resource: NIOFSIndexInput(path="/acc/ES/NBS/nodes/0/indices/eventlog-2014.07/0/index/_rqf.nvd"))]]]]
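To see at a glance which shards and files are affected, warnings of this shape can be summarized with a small script. This is only a rough sketch: the function name `parse_corrupt_shards` and the regular expression are illustrative assumptions that match the "received shard failed ... codec footer mismatch" messages quoted above, not any Elasticsearch tooling.

```python
import re
from typing import Dict, List

# Hypothetical pattern, written only against the warnings quoted above.
_FAILURE = re.compile(
    r'\[(?P<index>[^\[\]]+)\]\[(?P<shard>\d+)\] received shard failed for '
    r'.*?codec footer mismatch: actual footer=(?P<actual>-?\d+) '
    r'vs expected footer=(?P<expected>-?\d+) '
    r'\(resource: \w+\(path="(?P<path>[^"]+)"\)'
)


def parse_corrupt_shards(log_text: str) -> List[Dict[str, object]]:
    """Return one record per 'shard failed / footer mismatch' warning."""
    # Mail clients hard-wrap long log lines, so collapse all whitespace
    # runs to single spaces before matching.
    flat = " ".join(log_text.split())
    return [
        {
            "index": m.group("index"),
            "shard": int(m.group("shard")),
            "actual_footer": int(m.group("actual")),
            "expected_footer": int(m.group("expected")),
            "path": m.group("path"),
        }
        for m in _FAILURE.finditer(flat)
    ]
```

Feeding it the log excerpt above should yield one record per failed replica, each with the index name, shard number and the segment file that failed the footer check.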
Thanks,
David
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c0af53fb-6fdd-4624-bf6c-9b9d50081689%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Hey David, I have the same problem now. Have you found a solution?
Hello Mehmet,
For two indices with problematic shards (symptoms: a shard starts recovering,
the recovery stops, and recovery is then attempted on a different node), I
manually changed the replica count to 1, then 2. With a big index (over 90G, I
think), I was never able to recover the dual replica set; thankfully it was OK
to drop it. Upgrading to a more recent ES version helped too.
HTH,
David
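For anyone wanting to script the replica-count change described above, here is a minimal sketch using the index settings endpoint (`PUT <index>/_settings` with `index.number_of_replicas`). The helper names `build_replica_request` and `set_replicas`, the base URL and the timeout are assumptions for illustration only; nothing is sent until `set_replicas` is called.

```python
import json
import urllib.request


def build_replica_request(base_url: str, index: str, replicas: int) -> urllib.request.Request:
    """Build (but do not send) a PUT <index>/_settings request."""
    if replicas < 0:
        raise ValueError("replicas must be >= 0")
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def set_replicas(base_url: str, index: str, replicas: int, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response."""
    req = build_replica_request(base_url, index, replicas)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Assuming a node is reachable on `http://localhost:9200`, calling `set_replicas("http://localhost:9200", "eventlog-2014.06", 1)`, waiting for the replica to start, and then calling it again with `2` would mirror the "1, then 2" sequence.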
I've had similar problems. Two things that helped:
- If the index had more than one shard, optimizing it to one shard usually
worked.
- Otherwise, manually copying the shard files from the node holding the
primary shard to one of the nodes that kept failing.
Small mistake. The first point should be:
- If the shard had more than one segment, optimizing it to one segment
usually worked.
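The optimize-to-one-segment step can be sketched as a call to the `_optimize` endpoint with `max_num_segments=1`, which is the API in the ES 1.x line discussed here (later releases renamed it `_forcemerge`). The helper names `build_optimize_request` and `optimize_index`, the base URL and the timeout are illustrative assumptions.

```python
import json
import urllib.parse
import urllib.request


def build_optimize_request(base_url: str, index: str, max_num_segments: int = 1) -> urllib.request.Request:
    """Build (but do not send) a POST <index>/_optimize request."""
    if max_num_segments < 1:
        raise ValueError("max_num_segments must be >= 1")
    query = urllib.parse.urlencode({"max_num_segments": max_num_segments})
    url = f"{base_url.rstrip('/')}/{index}/_optimize?{query}"
    return urllib.request.Request(url=url, method="POST")


def optimize_index(base_url: str, index: str, timeout: float = 3600.0) -> dict:
    """Send the request; merging a large index can take a long time."""
    req = build_optimize_request(base_url, index, 1)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Merging down to one segment rewrites the whole shard, so it is best run while the index is not being written to.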