Failed Shard Recovery

I have 7 indices in our ES cluster, and one of the indices has an issue
that is preventing recovery.

Here is what I am seeing in the log:

[2015-02-10 00:00:02,483][WARN ][indices.recovery] [ES_Server1] [prodcustomer][1] recovery from [[ES_Server2][QDf3ZP3tQ3Kgund8YX2BBQ][ES_Server2][inet[/5.5.5.5:9300]]] failed
org.elasticsearch.transport.RemoteTransportException: [ES_Server2][inet[/5.5.5.5:9300]][index/shard/recovery/startRecovery]
Caused by: org.elasticsearch.index.engine.RecoveryEngineException: [prodcustomer][1] Phase[1] Execution failed
    at org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:1072)
    at org.elasticsearch.index.shard.service.InternalIndexShard.recover(InternalIndexShard.java:636)
    at org.elasticsearch.indices.recovery.RecoverySource.recover(RecoverySource.java:135)
    at org.elasticsearch.indices.recovery.RecoverySource.access$2500(RecoverySource.java:72)
    at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:440)
    at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:426)
    at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.indices.recovery.RecoverFilesRecoveryException: [prodcustomer][1] Failed to transfer [0] files with total size of [0b]
    at org.elasticsearch.indices.recovery.RecoverySource$1.phase1(RecoverySource.java:280)
    at org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:1068)
    ... 9 more
Caused by: java.io.EOFException: read past EOF: MMapIndexInput(path="D:\ElasticSearchData\new_es_cluster\nodes\0\indices\prodcustomer\1\index_checksums-1418875637019")
    at org.apache.lucene.store.ByteBufferIndexInput.readByte(ByteBufferIndexInput.java:81)
    at org.apache.lucene.store.DataInput.readInt(DataInput.java:96)
    at org.apache.lucene.store.ByteBufferIndexInput.readInt(ByteBufferIndexInput.java:132)
    at org.elasticsearch.index.store.Store$MetadataSnapshot.readLegacyChecksums(Store.java:523)
    at org.elasticsearch.index.store.Store$MetadataSnapshot.buildMetadata(Store.java:438)
    at org.elasticsearch.index.store.Store$MetadataSnapshot.<init>(Store.java:433)
    at org.elasticsearch.index.store.Store.getMetadata(Store.java:144)
    at org.elasticsearch.indices.recovery.RecoverySource$1.phase1(RecoverySource.java:145)
    ... 10 more
[2015-02-10 00:00:02,483][WARN ][indices.cluster] [ES_Server1] [prodcustomer][1] failed to start shard
org.elasticsearch.indices.recovery.RecoveryFailedException: [prodcustomer][1]: Recovery failed from [ES_Server2][QDf3ZP3tQ3Kgund8YX2BBQ][ES_Server2][inet[/5.5.5.5:9300]] into [ES_Server1][R3ArnIZsSVSsd-VLEaU_Ug][ES_Server1][inet[ES_Server1.verify.local/4.4.4.4:9300]]
    at org.elasticsearch.indices.recovery.RecoveryTarget.doRecovery(RecoveryTarget.java:306)
    at org.elasticsearch.indices.recovery.RecoveryTarget.access$200(RecoveryTarget.java:65)
    at org.elasticsearch.indices.recovery.RecoveryTarget$3.run(RecoveryTarget.java:184)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.transport.RemoteTransportException: [ES_Server2][inet[/5.5.5.5:9300]][index/shard/recovery/startRecovery]
Caused by: org.elasticsearch.index.engine.RecoveryEngineException: [prodcustomer][1] Phase[1] Execution failed
    at org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:1072)
    at org.elasticsearch.index.shard.service.InternalIndexShard.recover(InternalIndexShard.java:636)
    at org.elasticsearch.indices.recovery.RecoverySource.recover(RecoverySource.java:135)
    at org.elasticsearch.indices.recovery.RecoverySource.access$2500(RecoverySource.java:72)
    at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:440)
    at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:426)
    at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.indices.recovery.RecoverFilesRecoveryException: [prodcustomer][1] Failed to transfer [0] files with total size of [0b]
    at org.elasticsearch.indices.recovery.RecoverySource$1.phase1(RecoverySource.java:280)
    at org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:1068)
    ... 9 more
Caused by: java.io.EOFException: read past EOF: MMapIndexInput(path="D:\ElasticSearchData\new_es_cluster\nodes\0\indices\prodcustomer\1\index_checksums-1418875637019")
    at org.apache.lucene.store.ByteBufferIndexInput.readByte(ByteBufferIndexInput.java:81)
    at org.apache.lucene.store.DataInput.readInt(DataInput.java:96)
    at org.apache.lucene.store.ByteBufferIndexInput.readInt(ByteBufferIndexInput.java:132)
    at org.elasticsearch.index.store.Store$MetadataSnapshot.readLegacyChecksums(Store.java:523)
    at org.elasticsearch.index.store.Store$MetadataSnapshot.buildMetadata(Store.java:438)
    at org.elasticsearch.index.store.Store$MetadataSnapshot.<init>(Store.java:433)
    at org.elasticsearch.index.store.Store.getMetadata(Store.java:144)
    at org.elasticsearch.indices.recovery.RecoverySource$1.phase1(RecoverySource.java:145)
    ... 10 more

Is it as simple as deleting the checksum files?
Thank you for any insight anyone can provide.

Mario

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6c11de11-2d3b-4ed3-a173-bae4b50edf8c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

We are using version 1.3.1


Bumping up.
Does anyone have insight into this?

Mario


+1

I have the same issue, running 1.4.1.

This is my stack trace:
[2015-02-17 09:36:08,438][WARN ][indices.recovery] [data_node] [deragzicaqwarkgqdotwqecptkfibn-150216][1] recovery from [[data_node2][PC-hkqAnSQuPjzFjykEsSg][ip-X-X-X-X][inet[/x.x.x.x:9300]]{max_local_storage_nodes=1, zone=us-east-1d, master=false}] failed
org.elasticsearch.transport.RemoteTransportException: [data_node2][inet[/x.x.x.x:9300]][internal:index/shard/recovery/start_recovery]
Caused by: org.elasticsearch.index.engine.RecoveryEngineException: [deragzicaqwarkgqdotwqecptkfibn-150216][1] Phase[2] Execution failed
    at org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:1136)
    at org.elasticsearch.index.shard.service.InternalIndexShard.recover(InternalIndexShard.java:654)
    at org.elasticsearch.indices.recovery.RecoverySource.recover(RecoverySource.java:137)
    at org.elasticsearch.indices.recovery.RecoverySource.access$2600(RecoverySource.java:74)
    at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:464)
    at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:450)
    at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.transport.RemoteTransportException: [data_node][inet[/x.x.x.x:9300]][internal:index/shard/recovery/translog_ops]
Caused by: org.elasticsearch.index.mapper.MapperParsingException: object mapping for [json] tried to parse as object, but got EOF, has a concrete value been provided to it?
    at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:498)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:541)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:490)
    at org.elasticsearch.index.shard.service.InternalIndexShard.prepareCreate(InternalIndexShard.java:392)
    at org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryOperation(InternalIndexShard.java:775)
    at org.elasticsearch.indices.recovery.RecoveryTarget$TranslogOperationsRequestHandler.messageReceived(RecoveryTarget.java:433)
    at org.elasticsearch.indices.recovery.RecoveryTarget$TranslogOperationsRequestHandler.messageReceived(RecoveryTarget.


BTW, following the suggestion from https://groups.google.com/forum/#!searchin/elasticsearch/shard$20recovery$20fail/elasticsearch/138uWwW-vhM/WIKOhnI-DDYJ

I was able to work around this issue by setting number_of_replicas to 0, and then, after verifying that all the replicas were gone, increasing it back to 1. The new replicas came up without any issues.
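For anyone hitting the same thing, the workaround above amounts to two settings updates against the affected index via the REST API. A minimal sketch using curl; the host (localhost:9200) is an assumption for illustration, and the index name is taken from the original post's log (adjust both to your cluster):

```shell
#!/bin/sh
# Hypothetical host; the index name comes from the log in the original post.
ES_HOST="http://localhost:9200"
INDEX="prodcustomer"

# Step 1: drop replicas so the corrupt replica copies are deleted.
curl -XPUT "$ES_HOST/$INDEX/_settings" -d '
{"index": {"number_of_replicas": 0}}'

# Step 2: wait until the cluster reports green, i.e. all remaining
# (primary) shards are allocated and no replicas are pending.
curl "$ES_HOST/_cluster/health?wait_for_status=green&timeout=60s"

# Step 3: re-add replicas; they are rebuilt from the intact primaries,
# bypassing the damaged on-disk copies that recovery was choking on.
curl -XPUT "$ES_HOST/$INDEX/_settings" -d '
{"index": {"number_of_replicas": 1}}'
```

Note this only helps when the primaries are healthy and just the replica copies are bad; it discards the replica data rather than repairing it.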
