Remove unhealthy ECE node

UPPERCASE · October 6, 2022, 2:38pm

I am going through the ECE tutorial in Elastic Cloud itself. Due to a node failure I had to reinstall a node that was part of a cluster. Now, when I try to reinstall ECE on that node, I run into the error below, which is likely because its IP is still registered.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Elastic Cloud Enterprise Installer 

Install Elastic Cloud Enterprise on this host to add its resources to an existing installation. 
After installation is complete, the host becomes a runner that you can assign a role to in the Cloud UI. 
To learn more about the options you can specify, see the documentation.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 

-- Verifying Prerequisites --
Checking runner container does not exist... PASSED
Checking host storage root volume path is not root... PASSED
Checking host storage path is accessible... PASSED
Checking host storage path contents matches whitelist... PASSED
Checking Docker version... PASSED
Checking Docker SELinux support... PASSED
Checking Docker file system... PASSED
Checking Docker network settings... PASSED
Checking Docker storage driver... PASSED
Checking whether 'setuser' works inside a Docker container... PASSED
Checking memory settings... PASSED
 - Option `--memory-settings` not used. Default memory settings might be insufficient for production use!
Checking runner ip connectivity... PASSED
Checking coordinator connectivity... PASSED
Checking OS IPv4 IP forward setting... PASSED
Checking metadata endpoint protection... PASSED
 - The installation can proceed with 169.254.169.254 accessible; however we consider this a SSRF risk and recommend adjusting your firewall rules to prevent access from docker containers.
Checking OS max map count setting... PASSED
Checking OS kernel version... PASSED
Checking minimum required memory... PASSED
Checking OS kernel cgroup.memory... PASSED
Checking OS minimum ephemeral port... PASSED
Checking OS max open file descriptors per process... PASSED
Checking OS max open file descriptors system-wide... PASSED
Checking OS file system and Docker storage driver compatibility... PASSED
Checking OS file system storage driver permissions... PASSED
Checking OS AppArmor status... PASSED
Checking OS SELinux status... PASSED
-- Completed Verifying Prerequisites -- 

- Running Bootstrap container
- Monitoring bootstrap process
- Loaded bootstrap settings for additional host {}
- Validating roles token {}
- Core services started. {}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Errors have caused Elastic Cloud Enterprise installation to fail - Please check logs 
  Node type - additional
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the logs it says:

[2022-10-06 14:34:26,164][INFO ][no.found.docker.DockerContainerManager] Deleting container [frc-client-forwarders-client-forwarder] {"ec_container_kind":"docker","ec_container_group":"client-forwarders","ec_container_name":"client-forwarder"}
[2022-10-06 14:34:26,194][INFO ][no.found.bootstrap.containers.ClientForwarderContainerBootstrap] Removing container directory. [/mnt/data/elastic/172.31.30.9/services/client-forwarder] {}
[2022-10-06 14:34:26,202][ERROR][no.found.bootstrap.BootstrapAdditional$] Unhandled error. {}
java.lang.AssertionError: assertion failed: This runner ID [172.31.30.9] is already in use. To fix, either provide a different runner ID, or go into the Cloud UI and delete the existing runner.
  at scala.Predef$.assert(Predef.scala:223)
  at no.found.bootstrap.BootstrapAdditional.bootstrap(BootstrapAdditional.scala:223)
  at no.found.bootstrap.BootstrapAdditional$.delayedEndpoint$no$found$bootstrap$BootstrapAdditional$1(BootstrapAdditional.scala:381)
  at no.found.bootstrap.BootstrapAdditional$delayedInit$body.apply(BootstrapAdditional.scala:376)
  at scala.Function0.apply$mcV$sp(Function0.scala:39)
  at scala.Function0.apply$mcV$sp$(Function0.scala:39)
  at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
  at scala.App.$anonfun$main$1$adapted(App.scala:80)
  at scala.collection.immutable.List.foreach(List.scala:431)
  at scala.App.main(App.scala:80)
  at scala.App.main$(App.scala:78)
  at no.found.util.ElasticCloudApp.main(ElasticCloudApp.scala:23)
  at no.found.bootstrap.BootstrapAdditional.main(BootstrapAdditional.scala)
[2022-10-06 14:34:26,380][ERROR][scala.Predef$            ] Uncaught throwable occurred on thread: [main], calling System.exit(1) {}
java.lang.AssertionError: assertion failed: This runner ID [172.31.30.9] is already in use. To fix, either provide a different runner ID, or go into the Cloud UI and delete the existing runner.
  at scala.Predef$.assert(Predef.scala:223)
  at no.found.bootstrap.BootstrapAdditional.bootstrap(BootstrapAdditional.scala:223)
  at no.found.bootstrap.BootstrapAdditional$.delayedEndpoint$no$found$bootstrap$BootstrapAdditional$1(BootstrapAdditional.scala:381)
  at no.found.bootstrap.BootstrapAdditional$delayedInit$body.apply(BootstrapAdditional.scala:376)
  at scala.Function0.apply$mcV$sp(Function0.scala:39)
  at scala.Function0.apply$mcV$sp$(Function0.scala:39)
  at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
  at scala.App.$anonfun$main$1$adapted(App.scala:80)
  at scala.collection.immutable.List.foreach(List.scala:431)
  at scala.App.main(App.scala:80)
  at scala.App.main$(App.scala:78)
  at no.found.util.ElasticCloudApp.main(ElasticCloudApp.scala:23)
  at no.found.bootstrap.BootstrapAdditional.main(BootstrapAdditional.scala)
[2022-10-06 14:34:26,389][INFO ][no.found.util.LogApplicationExit$] Application is exiting {}
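
The key line in that log is the assertion naming the runner ID that is still registered. As a small aid for anyone scanning a long bootstrap log, here is a minimal sketch that pulls that ID out of the message. The function name `runner_id_in_use` is made up for illustration, and the pattern only matches the exact wording shown in the log above.

```python
import re
from typing import Optional

# Matches the assertion text from the bootstrap log and captures the runner ID
# between the square brackets.
RUNNER_IN_USE = re.compile(r"This runner ID \[([^\]]+)\] is already in use")


def runner_id_in_use(log_text: str) -> Optional[str]:
    """Return the runner ID reported as already in use, or None if absent."""
    match = RUNNER_IN_USE.search(log_text)
    return match.group(1) if match else None


sample = (
    "java.lang.AssertionError: assertion failed: This runner ID [172.31.30.9] "
    "is already in use. To fix, either provide a different runner ID, "
    "or go into the Cloud UI and delete the existing runner."
)
print(runner_id_in_use(sample))  # 172.31.30.9
```

In this thread the runner ID is the host IP, which fits the guess that the old registration is what blocks the reinstall.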

The dashboard, however, shows no node to remove.

There must be a way to remove this node, but I can't find out how. Any suggestions?

UPPERCASE · October 7, 2022, 12:55pm

Maybe I should rephrase the question: how do I delete this host? All these layers of convenience are nice, but they hide a bit too much and make it unclear to me what to do.

[Screenshot: Allocator page in Elastic Cloud Enterprise, 2022-10-07]

For search engines:

You can delete a host if you are de-provisioning an instance
Disconnect this host before deleting it

I am not sure this was the best method, but it worked: I put the allocator in maintenance mode and removed all of its roles, then simply removed all of the containers. After that I was able to remove the unhealthy node from the web dashboard.
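
For anyone who wants to script this instead of clicking through the web dashboard, the steps above can be laid out as an ordered plan. This is only a sketch under assumptions, not what was actually run in this thread: the REST paths are my understanding of the ECE API and should be checked against the API reference for your version, the `docker` command assumes the host runs nothing but ECE, and `Step` and `removal_plan` are made-up names. The code only prints the plan; it does not call anything.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    where: str   # "api" = ECE REST API request, "host" = command on the failed host
    action: str


# Assumed ECE API prefix; verify against the API reference for your ECE version.
API = "/api/v1/platform/infrastructure"


def removal_plan(host_id: str) -> List[Step]:
    """Ordered steps mirroring the manual procedure described in the post."""
    return [
        # 1. Put the allocator in maintenance mode.
        Step("api", f"POST {API}/allocators/{host_id}/maintenance-mode/_start"),
        # 2. Remove all roles (request body with an empty role list, schema not shown).
        Step("api", f"PUT {API}/runners/{host_id}/roles"),
        # 3. Remove every container on the host (destructive, ECE-only hosts).
        Step("host", "docker rm -f $(docker ps -a -q)"),
        # 4. Delete the now disconnected runner.
        Step("api", f"DELETE {API}/runners/{host_id}"),
    ]


for step in removal_plan("172.31.30.9"):
    print(f"[{step.where}] {step.action}")
```

The order matters: per the message in the screenshot, the host has to be disconnected before it can be deleted, which is why the delete comes last.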

