However, after starting both nodes, when we attempt to re-enable allocation,
Node2 fails to reallocate shards and throws the following error. The errors
keep recurring, so we eventually had to turn off replication again, and we
are now getting by with a single node in the cluster with no replication.
Below is the stack trace of the NullPointerException thrown when trying to
enable allocation. Any pointers to help with solving this would be greatly
appreciated. I can provide more details if needed.
[WARN ][indices.cluster ] [Valentina Allegra de La Fontaine] [production_restaurants][2] failed to start shard
org.elasticsearch.indices.recovery.RecoveryFailedException: [production_restaurants][2]: Recovery failed from [Shadowmage][Ei4dGmkmScmY0WOPN8PR_Q][server][inet[/server_ip:9300]] into [Valentina Allegra de La Fontaine][LJO_jO59QGuGa-jLmTOZWg][server][inet[/server_ip:9300]]
    at org.elasticsearch.indices.recovery.RecoveryTarget.doRecovery(RecoveryTarget.java:306)
    at org.elasticsearch.indices.recovery.RecoveryTarget.access$300(RecoveryTarget.java:65)
    at org.elasticsearch.indices.recovery.RecoveryTarget$2.run(RecoveryTarget.java:175)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.transport.RemoteTransportException: [Shadowmage][inet[/172.16.21.21:9300]][index/shard/recovery/startRecovery]
Caused by: org.elasticsearch.index.engine.RecoveryEngineException: [production_restaurants][2] Phase[1] Execution failed
    at org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:996)
    at org.elasticsearch.index.shard.service.InternalIndexShard.recover(InternalIndexShard.java:631)
    at org.elasticsearch.indices.recovery.RecoverySource.recover(RecoverySource.java:122)
    at org.elasticsearch.indices.recovery.RecoverySource.access$1600(RecoverySource.java:62)
    at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:351)
    at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:337)
    at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:270)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.indices.recovery.RecoverFilesRecoveryException: [production_restaurants][2] Failed to transfer [44] files with total size of [904.2mb]
    at org.elasticsearch.indices.recovery.RecoverySource$1.phase1(RecoverySource.java:243)
    at org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:993)
    ... 9 more
Caused by: java.lang.NullPointerException