ECE Platform upgrade from 2.13.2 to 3.3.0 failed and got stuck during rollback

I tried to upgrade the ECE platform from 2.13.2 to 3.3.0 using the "elastic-cloud-enterprise.sh" script. The upgrade failed and got stuck during rollback. I then tried restarting frc-runner-runner, and now all the allocators in the GUI show as unhealthy.
The upgrader.log looks like this:

[2022-12-05 15:15:36,627][INFO ][no.found.upgrade.coordinator.RollbackCoordinator] Updating statuses of local steps to ZooKeeper {}
[2022-12-05 15:15:36,670][ERROR][no.found.upgrade.agents.container.DockerContainerRollbackAgent] Unable to rollback container [constructors-constructor] because the backup container [constructors-constructor_bak] was not found {}
[2022-12-05 15:15:36,737][ERROR][no.found.upgrade.agents.container.DockerContainerRollbackAgent] Unable to rollback container [services-forwarders-services-forwarder] because the backup container [services-forwarders-services-forwarder_bak] was not found {}
[2022-12-05 15:15:36,804][ERROR][no.found.upgrade.agents.container.DockerContainerRollbackAgent] Unable to rollback container [client-forwarders-client-forwarder] because the backup container [client-forwarders-client-forwarder_bak] was not found {}
[2022-12-05 15:15:36,868][ERROR][no.found.upgrade.agents.container.DockerContainerRollbackAgent] Unable to rollback container [zookeeper-servers-zookeeper] because the backup container [zookeeper-servers-zookeeper_bak] was not found {}
[2022-12-05 15:15:36,931][ERROR][no.found.upgrade.agents.container.DockerContainerRollbackAgent] Unable to rollback container [directors-director] because the backup container [directors-director_bak] was not found {}
[2022-12-05 15:15:37,000][ERROR][no.found.upgrade.agents.container.DockerContainerRollbackAgent] Unable to rollback container [runners-runner] because the backup container [runners-runner_bak] was not found {}
[2022-12-05 15:15:37,003][INFO ][no.found.docker.DockerContainerManager] Starting container [frc-runners-runner] {"ec_container_kind":"docker","ec_container_group":"runners","ec_container_name":"runner"}
[2022-12-05 15:15:37,492][INFO ][no.found.zookeeper.models.runners.Runner] Removing container [upgraders-upgrader] from runner [tor63secapstgececoordinator01-v] {}
[2022-12-05 15:15:37,590][INFO ][no.found.zookeeper.models.runners.Runner] Container [upgraders-upgrader] was removed from the runner [tor63secapstgececoordinator01-v] {}
[2022-12-05 15:15:37,598][INFO ][org.apache.curator.framework.imps.CuratorFrameworkImpl] backgroundOperationsLoop exiting {}
[2022-12-05 15:15:37,610][WARN ][org.apache.zookeeper.ClientCnxn] An exception was thrown while closing send thread for session 0xc01206e63de4468. {}
org.apache.zookeeper.ClientCnxn$EndOfStreamException: Unable to read additional data from server sessionid 0xc01206e63de4468, likely server has closed socket
        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1290)
[2022-12-05 15:15:37,745][INFO ][akka.actor.CoordinatedShutdown] Running CoordinatedShutdown with reason [ActorSystemTerminateReason] {}
[2022-12-05 15:15:37,784][INFO ][no.found.upgrade.BootstrapUpgrade$] ================================================================================ {}
[2022-12-05 15:15:37,785][INFO ][no.found.upgrade.BootstrapUpgrade$]                            UPGRADE SESSION FINISHED {}
[2022-12-05 15:15:37,787][INFO ][no.found.upgrade.BootstrapUpgrade$] ================================================================================ {}
[2022-12-05 15:15:37,790][INFO ][no.found.upgrade.BootstrapUpgrade$] Waiting for runner to destroy the applications {}
[2022-12-05 15:15:42,790][INFO ][no.found.upgrade.BootstrapUpgrade$] Waiting for runner to destroy the applications {}
[2022-12-05 15:15:47,791][INFO ][no.found.upgrade.BootstrapUpgrade$] Waiting for runner to destroy the applications {}
[2022-12-05 15:15:52,791][INFO ][no.found.upgrade.BootstrapUpgrade$] Waiting for runner to destroy the applications {}
[2022-12-05 15:15:57,792][INFO ][no.found.upgrade.BootstrapUpgrade$] Waiting for runner to destroy the applications {}
[2022-12-05 15:16:02,793][INFO ][no.found.upgrade.BootstrapUpgrade$] Waiting for runner to destroy the applications {}
[2022-12-05 15:16:07,793][INFO ][no.found.upgrade.BootstrapUpgrade$] Waiting for runner to destroy the applications {}
[2022-12-05 15:16:12,794][INFO ][no.found.upgrade.BootstrapUpgrade$] Waiting for runner to destroy the applications {}
[2022-12-05 15:16:17,795][INFO ][no.found.upgrade.BootstrapUpgrade$] Waiting for runner to destroy the applications {}
[2022-12-05 15:16:22,796][INFO ][no.found.upgrade.BootstrapUpgrade$] Waiting for runner to destroy the applications {}
[2022-12-05 15:16:27,797][INFO ][no.found.upgrade.BootstrapUpgrade$] Waiting for runner to destroy the applications {}
[2022-12-05 15:16:32,798][INFO ][no.found.upgrade.BootstrapUpgrade$] Waiting for runner to destroy the applications {}
[2022-12-05 15:16:37,807][INFO ][no.found.util.LogApplicationExit$] Application is exiting {}
[2022-12-05 15:16:49,485][INFO ][no.found.upgrade.BootstrapUpgrade$] ================================================================================ {}
[2022-12-05 15:16:49,500][INFO ][no.found.upgrade.BootstrapUpgrade$]                            UPGRADE SESSION STARTED {}
[2022-12-05 15:16:49,501][INFO ][no.found.upgrade.BootstrapUpgrade$] ================================================================================ {}
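
In case it helps when sharing this with support, here is a minimal sketch (not an Elastic tool) that pulls out which containers the rollback agent could not restore. It only assumes the error wording shown in the log above; the function name `missing_backups` is made up for illustration, and the space-stripping step is a heuristic for logs copied from a wrapped terminal.

```python
import re

# Pattern for the DockerContainerRollbackAgent error as it appears in the log above.
ROLLBACK_ERROR = re.compile(
    r"Unable to rollback container \[(?P<container>[^\]]+)\] "
    r"because the backup container \[(?P<backup>[^\]]+)\] was not found"
)


def missing_backups(log_text):
    """Return (container, backup) pairs for rollbacks that failed on a missing backup."""
    pairs = []
    for line in log_text.splitlines():
        # Heuristic (assumption): terminal wrapping can leave runs of spaces
        # in the middle of a word, so drop any run of two or more spaces.
        normalized = re.sub(r" {2,}", "", line)
        match = ROLLBACK_ERROR.search(normalized)
        if match:
            pairs.append((match.group("container"), match.group("backup")))
    return pairs
```

Passing the contents of upgrader.log to `missing_backups` returns one pair per failed rollback, in log order. For the excerpt above that would be six containers, from constructors-constructor through runners-runner.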

I would recommend reaching out to Elastic Support about this.

Thanks Christian!
