RecoveryFailedException while adding a node during Elasticsearch upgrade from 0.2.0 to 1.2.1

anurag_naidu · June 24, 2014, 11:14pm

We are upgrading our ES server from 0.2.0 to 1.2.1, following the cluster
restart upgrade steps described here:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-upgrade.html#restart-upgrade.
We have 2 nodes as part of our cluster. We disabled allocation on both
nodes before the upgrade and then upgraded the servers.

However, after starting both nodes, when we attempt to re-enable allocation,
shard reallocation to Node2 fails and it throws the following error. The
errors keep recurring, so we eventually had to turn off replication again,
and we are now getting by with a single node in the cluster and no
replication.
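
For readers following along, the allocation toggle described above can be sketched as a cluster settings update. This is a minimal sketch, not the exact commands used in this thread: it assumes the `cluster.routing.allocation.enable` setting available in the 1.x line and a node reachable at `http://localhost:9200` (a hypothetical address). The helper `allocation_request` is made up for illustration, and it only builds the request without sending it.

```python
import json
import urllib.request

# Assumption: a node reachable at this address; not taken from the thread.
ES_URL = "http://localhost:9200"


def allocation_request(mode, base_url=ES_URL):
    """Build (but do not send) a PUT /_cluster/settings request that sets
    cluster.routing.allocation.enable to `mode` ("none" or "all")."""
    if mode not in ("all", "none"):
        raise ValueError("mode must be 'all' or 'none'")
    # Transient settings are dropped after a full cluster restart.
    body = {"transient": {"cluster.routing.allocation.enable": mode}}
    return urllib.request.Request(
        url=base_url + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
    )


disable = allocation_request("none")  # before shutting the nodes down
enable = allocation_request("all")    # after all nodes have rejoined

# To actually apply a change: urllib.request.urlopen(enable)
```

Building the request separately from sending it makes it easy to inspect the exact body before touching a production cluster.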

Below is the stacktrace of the NullPointerException when trying to enable
allocation. Any pointers to help with solving this would be greatly
appreciated. I can provide more details if needed.

[WARN ][indices.cluster ] [Valentina Allegra de La Fontaine] [production_restaurants][2] failed to start shard
org.elasticsearch.indices.recovery.RecoveryFailedException: [production_restaurants][2]: Recovery failed from [Shadowmage][Ei4dGmkmScmY0WOPN8PR_Q][server][inet[/server_ip:9300]] into [Valentina Allegra de La Fontaine][LJO_jO59QGuGa-jLmTOZWg][server][inet[/server_ip:9300]]
    at org.elasticsearch.indices.recovery.RecoveryTarget.doRecovery(RecoveryTarget.java:306)
    at org.elasticsearch.indices.recovery.RecoveryTarget.access$300(RecoveryTarget.java:65)
    at org.elasticsearch.indices.recovery.RecoveryTarget$2.run(RecoveryTarget.java:175)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.transport.RemoteTransportException: [Shadowmage][inet[/172.16.21.21:9300]][index/shard/recovery/startRecovery]
Caused by: org.elasticsearch.index.engine.RecoveryEngineException: [production_restaurants][2] Phase[1] Execution failed
    at org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:996)
    at org.elasticsearch.index.shard.service.InternalIndexShard.recover(InternalIndexShard.java:631)
    at org.elasticsearch.indices.recovery.RecoverySource.recover(RecoverySource.java:122)
    at org.elasticsearch.indices.recovery.RecoverySource.access$1600(RecoverySource.java:62)
    at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:351)
    at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:337)
    at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:270)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.indices.recovery.RecoverFilesRecoveryException: [production_restaurants][2] Failed to transfer [44] files with total size of [904.2mb]
    at org.elasticsearch.indices.recovery.RecoverySource$1.phase1(RecoverySource.java:243)
    at org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:993)
    ... 9 more
Caused by: java.lang.NullPointerException

Thanks
-anurag

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/1788dbdf-7e0c-4f7d-8382-01ad47a15cb7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

spinscale · June 30, 2014, 7:20am

Hey,

Can you check your log files and get the full stack trace of the
NullPointerException, so we can find out what is going on? Thanks!
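
One way to locate the relevant part of a large log is to pull out the chain of "Caused by:" entries first. The sketch below is only an illustration: `caused_by_chain` is a hypothetical helper, and the sample text is an excerpt of the trace posted in this thread. If the innermost entry has no `at ...` frames after it, the frames were not included in that log output. Because the failure arrives wrapped in a RemoteTransportException, the log of the source node (Shadowmage here) is probably the place to look for them.

```python
import re


def caused_by_chain(log_text):
    """Return exception class names from 'Caused by:' entries, outermost first.

    The class name may follow on the same line or be wrapped onto the next
    line, so any whitespace (including newlines) is allowed after the colon.
    """
    pattern = re.compile(r"Caused by:\s*([\w.$]+)")
    return pattern.findall(log_text)


# Excerpt of the trace from this thread, with its original line wrapping.
sample = """Caused by: org.elasticsearch.index.engine.RecoveryEngineException:
[production_restaurants][2] Phase[1] Execution failed
Caused by:
org.elasticsearch.indices.recovery.RecoverFilesRecoveryException:
[production_restaurants][2] Failed to transfer [44] files with total size
of [904.2mb]
Caused by: java.lang.NullPointerException
"""

chain = caused_by_chain(sample)
for name in chain:
    print(name)
```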

--Alex

