Issues with installing elastic-cloud-enterprise-installer


(Wesley) #1

Hi.

I am trying to install Elastic Cloud Enterprise on an Ubuntu 14 machine.

I have everything in place, but the installer is stuck on "Loaded bootstrap settings".
Even after waiting for more than 24 hours, it is still stuck at that step.

The output I get when I interrupt the installer is:

Traceback (most recent call last):
  File "/elastic_cloud_apps/bootstrap-initiator/initiator.py", line 68, in <module>
    monitor.logging_and_bootstrap_monitor(bootstrap_properties, enable_debug)
  File "/elastic_cloud_apps/bootstrap-initiator/bootstrap_initiator/monitor.py", line 18, in logging_and_bootstrap_monitor
    sleep(5)
KeyboardInterrupt
Traceback (most recent call last):
  File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/elastic_cloud_apps/bootstrap-initiator/bootstrap_initiator/logging.py", line 73, in monitor_bootstrap_log
    process_log(bootstrap_properties, debug_enabled)
  File "/elastic_cloud_apps/bootstrap-initiator/bootstrap_initiator/logging.py", line 47, in process_log
    p = tail("-f", "{0}/logs/bootstrap-logs/bootstrap.log".format(bootstrap_properties['HOST_STORAGE_PATH']), _out=process_info_log_output)
  File "/usr/lib/python3.5/site-packages/sh.py", line 1021, in __call__
    return RunningCommand(cmd, call_args, stdin, stdout, stderr)
  File "/usr/lib/python3.5/site-packages/sh.py", line 486, in __init__
    self.wait()
  File "/usr/lib/python3.5/site-packages/sh.py", line 493, in wait
    exit_code = self.process.wait()
  File "/usr/lib/python3.5/site-packages/sh.py", line 1601, in wait
    pid, exit_code = os.waitpid(self.pid, 0) # blocks

Could someone give me a push in the right direction?


(Uri Cohen) #2

Hi @Henk

Can you run the installer again with the --debug flag and post the output here?

Thanks,
Uri


(Wesley) #3

Hi @uricohen,

Thank you for your quick response.

I ran the installer in debug mode, and it is stuck at the following step:

[2017-01-11 11:50:29,632][INFO ][org.apache.curator.framework.imps.CuratorFrameworkImpl][] Default schema

  • [2017-01-11 11:50:31,631][INFO ][no.found.curator.ForwardedEnsembleProvider][] Unable to read servers list from [http://xxx.xxx.xxx.xxx:2180/zookeeper/clients/ensemble/connection-string?namespace=/v1], falling back to [0.0.0.0:2181]
  • [2017-01-11 11:50:31,631][INFO ][no.found.curator.ForwardedEnsembleProvider][] Resolved connection string from [http://xxx.xxx.xxx.xxx:2180/zookeeper/clients/ensemble/connection-string?namespace=/v1] to [0.0.0.0:2181/v1] with local namespace [/v1]
  • [2017-01-11 11:50:33,631][INFO ][no.found.curator.ForwardedEnsembleProvider][] Unable to read servers list from [http://xxx.xxx.xxx.xxx:2180/zookeeper/clients/ensemble/connection-string?namespace=/v1], falling back to [0.0.0.0:2181]
  • [2017-01-11 11:50:33,631][INFO ][no.found.curator.ForwardedEnsembleProvider][] Resolved connection string from [http://xxx.xxx.xxx.xxx:2180/zookeeper/clients/ensemble/connection-string?namespace=/v1] to [0.0.0.0:2181/v1] with local namespace [/v1]
  • [2017-01-11 11:50:35,631][INFO ][no.found.curator.ForwardedEnsembleProvider][] Unable to read servers list from [http://xxx.xxx.xxx.xxx:2180/zookeeper/clients/ensemble/connection-string?namespace=/v1], falling back to [0.0.0.0:2181]
  • [2017-01-11 11:50:35,631][INFO ][no.found.curator.ForwardedEnsembleProvider][] Resolved connection string from [http://xxx.xxx.xxx.xxx:2180/zookeeper/clients/ensemble/connection-string?namespace=/v1] to [0.0.0.0:2181/v1] with local namespace [/v1]
  • [2017-01-11 11:50:37,631][INFO ][no.found.curator.ForwardedEnsembleProvider][] Unable to read servers list from [http://xxx.xxx.xxx.xxx:2180/zookeeper/clients/ensemble/connection-string?namespace=/v1], falling back to [0.0.0.0:2181]
  • [2017-01-11 11:50:37,631][INFO ][no.found.curator.ForwardedEnsembleProvider][] Resolved connection string from [http://xxx.xxx.xxx.xxx:2180/zookeeper/clients/ensemble/connection-string?namespace=/v1] to [0.0.0.0:2181/v1] with local namespace [/v1]
  • [2017-01-11 11:50:39,631][INFO ][no.found.curator.ForwardedEnsembleProvider][] Unable to read servers list from [http://xxx.xxx.xxx.xxx:2180/zookeeper/clients/ensemble/connection-string?namespace=/v1], falling back to [0.0.0.0:2181]
  • [2017-01-11 11:50:39,631][INFO ][no.found.curator.ForwardedEnsembleProvider][] Resolved connection string from [http://xxx.xxx.xxx.xxx:2180/zookeeper/clients/ensemble/connection-string?namespace=/v1] to [0.0.0.0:2181/v1] with local namespace [/v1]
  • [2017-01-11 11:50:41,631][INFO ][no.found.curator.ForwardedEnsembleProvider][] Unable to read servers list from [http://xxx.xxx.xxx.xxx:2180/zookeeper/clients/ensemble/connection-string?namespace=/v1], falling back to [0.0.0.0:2181]
  • [2017-01-11 11:50:41,631][INFO ][no.found.curator.ForwardedEnsembleProvider][] Resolved connection string from [http://xxx.xxx.xxx.xxx:2180/zookeeper/clients/ensemble/connection-string?namespace=/v1] to [0.0.0.0:2181/v1] with local namespace

(Uri Cohen) #4

Hi @Henk

This looks like a connectivity issue. Is the public hostname you specified at the beginning of the installation routable from within the host? If you're on AWS, for example, you should use the AWS public hostname rather than the public IP.
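
One quick way to test this from the host itself is a plain TCP connection attempt. The following is only a minimal sketch, not part of the installer: the function name can_connect is made up for illustration, and port 2180 is taken from the debug log above. Run it on the ECE host with the same hostname you gave the installer.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it is not shipped with ECE.
    """
    try:
        # create_connection resolves the hostname and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures (socket.gaierror), refused connections, and timeouts.
        return False


# Example (assumed values): can_connect("my-public-hostname.example.com", 2180)
```

If this returns False for the public hostname but True for 127.0.0.1, the hostname is probably not routable from inside the host, which would match the "Unable to read servers list" messages.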


(Wesley) #5

Thanks @uricohen.

I fixed the issue, and the installer is now hanging on the following:

  • [2017-01-11 15:12:47,789][INFO ][no.found.docker.DockerContainerManager][ec_container_group=runners, ec_container_kind=docker, ec_container_name=runner] Starting container
  • [2017-01-11 15:12:47,965][INFO ][no.found.bootstrap.BootstrapInitial][] Started local runner
  • [2017-01-11 15:12:47,968][INFO ][no.found.bootstrap.BootstrapInitial][] Waiting for runner container node
  • [2017-01-11 15:12:51,122][INFO ][no.found.bootstrap.BootstrapInitial][] Runner container node detected
  • [2017-01-11 15:12:51,191][INFO ][no.found.bootstrap.BootstrapInitial][] Waiting for coordinator candidate
  • [2017-01-11 15:13:00,655][INFO ][no.found.bootstrap.BootstrapInitial][] Detected coordinator candidate
  • [2017-01-11 15:13:00,659][INFO ][no.found.bootstrap.BootstrapInitial][] Detected pending coordinator, promoting coordinator
  • [2017-01-11 15:13:00,951][INFO ][no.found.zookeeper.models.coordinators.CoordinatorCandidatePendingInfo][] Current instances: [4294967319,4294967319,1484147566897,1484147566897,0,0,0,0,2,0,4294967319], secrets: [4294967321,4294967321,1484147566908,1484147566908,0,0,0,0,19609,0,4294967321]
  • [2017-01-11 15:13:00,998][INFO ][no.found.bootstrap.BootstrapInitial][] Coordinator accepted
  • [2017-01-11 15:13:00,999][INFO ][no.found.bootstrap.BootstrapInitial][] Storing current platform version: 1.0.0-alpha4
  • [2017-01-11 15:13:01,153][INFO ][no.found.bootstrap.BootstrapInitial][] Storing Elastic Stack versions: [2.4.1,5.0.2]
  • [2017-01-11 15:13:01,166][INFO ][no.found.bootstrap.BootstrapInitial][] Creating Admin Console Elasticsearch backend
  • [2017-01-11 15:13:01,343][INFO ][no.found.bootstrap.ServiceLayerBootstrap][] Waiting for [ensuring-plan] to complete. Retrying every [1 second] (cause: [org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /clusters/e3dbb993ef0f4502b0ff162493216e7c/plans/status])
  • [2017-01-11 15:13:06,451][INFO ][no.found.bootstrap.ServiceLayerBootstrap][] Waiting for [ensuring-plan] to complete. Retrying every [1 second] (cause: [java.lang.Exception: not yet started])
  • [2017-01-11 15:13:07,479][INFO ][no.found.bootstrap.BootstrapInitial][] Applying Elasticsearch index template from: [HttpRequest(GET,http://containerhost:9200,List(X-Found-Cluster: e3dbb993ef0f4502b0ff162493216e7c, Authorization: Basic YWRtaW46dU1HdmZyeTMzcXMyZ0x0VXBBSGlVTlY4bVdBNTBObUtaZ3I2bGdlY3BDUT0=),Empty,HTTP/1.1)], uMGvfry33qs2gLtUpAHiUNV8mWA50NmKZgr6lgecpCQ=
  • [2017-01-11 15:13:08,497][INFO ][no.found.bootstrap.ServiceLayerBootstrap][] Waiting for [apply-elasticsearch-template] to complete. Retrying every [1 second] (cause: [spray.can.Http$ConnectionException: ErrorClosed(Connection reset by peer)])
  • [2017-01-11 15:23:07,505][INFO ][no.found.bootstrap.ServiceLayerBootstrap][] Waiting for [apply-elasticsearch-template] to complete. Retrying every [1 second] (cause: [akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://default/user/IO-HTTP#-1376349401]] after [18010000 ms]])
  Errors have caused Elastic Cloud Enterprise installation to fail
  Node type - initial
  

(Uri Cohen) #6

Hi @Henk

This is the installer trying to create an index in the Elasticsearch admin cluster that backs the Elastic Cloud Enterprise API server. It looks like Elasticsearch didn't start. Can you run "docker ps" and paste the output here?
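
If the full listing is hard to scan, a small filter can pick out just the Elasticsearch rows. This is a rough sketch under assumptions: it expects tab-separated image, status, and name columns (for example from a docker ps --format template using the .Image, .Status, and .Names fields), and the function name elasticsearch_containers is hypothetical.

```python
from typing import List, Tuple


def elasticsearch_containers(ps_output: str) -> List[Tuple[str, str]]:
    """Return (name, status) pairs for rows whose image is an Elasticsearch image.

    Assumes each line is "image<TAB>status<TAB>name"; other lines are skipped.
    """
    found = []
    for line in ps_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # not in the assumed three-column format
        image, status, name = parts
        # Match images such as .../cloud-enterprise/elasticsearch:2.4.1-1
        if "/elasticsearch:" in image:
            found.append((name, status))
    return found
```

An empty result would suggest that no Elasticsearch container is running at all; a row with an "Up ..." status means the container started, so the problem is more likely in reaching it.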

Thanks
Uri


(Wesley) #7

Hi @uricohen.

First of all, thanks for helping me again :).

The output I get when I run "docker ps" is:

CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d44ab6217b1a docker.elastic.co/cloud-enterprise/elasticsearch:2.4.1-1 "/sbin/my_init --skip" 16 hours ago Up 16 hours 0.0.0.0:18025->18025/tcp, 0.0.0.0:19810->19810/tcp fac-e3dbb993ef0f4502b0ff162493216e7c-instance-0000000000
be4f057735ba docker.elastic.co/cloud-enterprise/elastic-cloud-enterprise:1.0.0-alpha4 "/sbin/my_init" 16 hours ago Up 16 hours 0.0.0.0:12400->5601/tcp, 0.0.0.0:12443->5643/tcp frc-cloud-uis-cloud-ui
0af6cda52300 docker.elastic.co/cloud-enterprise/elastic-cloud-enterprise:1.0.0-alpha4 "/sbin/my_init" 16 hours ago Up 16 hours frc-constructors-constructor
afaef2a0e8f7 docker.elastic.co/cloud-enterprise/elastic-cloud-enterprise:1.0.0-alpha4 "/sbin/my_init" 16 hours ago Up 16 hours 0.0.0.0:8764->8764/tcp, 0.0.0.0:12300->12300/tcp, 0.0.0.0:12343->12343/tcp frc-admin-consoles-admin-console
58a99637297e docker.elastic.co/cloud-enterprise/elastic-cloud-enterprise:1.0.0-alpha4 "/sbin/my_init" 16 hours ago Up 16 hours frc-upgraders-upgrader
2b34b8785217 docker.elastic.co/cloud-enterprise/elastic-cloud-enterprise:1.0.0-alpha4 "/sbin/my_init" 16 hours ago Up 16 hours frc-services-forwarders-services-forwarder
4cc24f06f2b8 docker.elastic.co/cloud-enterprise/elastic-cloud-enterprise:1.0.0-alpha4 "/sbin/my_init" 16 hours ago Up 16 hours 0.0.0.0:2112->2112/tcp frc-directors-director
722b5240662e docker.elastic.co/cloud-enterprise/elastic-cloud-enterprise:1.0.0-alpha4 "/sbin/my_init" 16 hours ago Up 16 hours 0.0.0.0:12375->12375/tcp frc-upgradables-upgradable
1c714be55722 docker.elastic.co/cloud-enterprise/elastic-cloud-enterprise:1.0.0-alpha4 "/sbin/my_init" 16 hours ago Up 16 hours 0.0.0.0:9200->9200/tcp, 0.0.0.0:9243->9243/tcp, 0.0.0.0:9300->9300/tcp, 0.0.0.0:9343->9343/tcp frc-proxies-proxy
c84e3ab0e1bd docker.elastic.co/cloud-enterprise/elastic-cloud-enterprise:1.0.0-alpha4 "/sbin/my_init" 16 hours ago Up 16 hours frc-allocators-allocator
7101dcd4808f docker.elastic.co/cloud-enterprise/elastic-cloud-enterprise:1.0.0-alpha4 "/sbin/my_init" 16 hours ago Up 16 hours frc-blueprints-blueprint
bee251cd7885 docker.elastic.co/cloud-enterprise/elastic-cloud-enterprise:1.0.0-alpha4 "/sbin/my_init" 16 hours ago Up 16 hours frc-runners-runner
0ee67b7e183d docker.elastic.co/cloud-enterprise/elastic-cloud-enterprise:1.0.0-alpha4 "/sbin/my_init" 16 hours ago Up 16 hours frc-client-forwarders-client-forwarder
0fc64798b955 docker.elastic.co/cloud-enterprise/elastic-cloud-enterprise:1.0.0-alpha4 "/sbin/my_init" 16 hours ago Up 16 hours 0.0.0.0:2191->2191/tcp, 0.0.0.0:12191->12191/tcp, 0.0.0.0:12898->12898/tcp, 0.0.0.0:13898->13898/tcp frc-zookeeper-servers-zookeeper


(Alex Piggott) #8

Hi @Henk

Sorry you're having issues. It looks like there is a problem connecting to the Elasticsearch instance that holds a lot of the admin information.

Can you try the following curl commands:

curl -H 'x-found-cluster: e3dbb993ef0f4502b0ff162493216e7c' 'localhost:18025'
curl -H 'x-found-cluster: e3dbb993ef0f4502b0ff162493216e7c' 'localhost:9200'
curl -H 'x-found-cluster: e3dbb993ef0f4502b0ff162493216e7c' 'localhost:9244'

They should all return the usual Elasticsearch welcome message. If they all fail, it is very likely that the Elasticsearch instance itself is having issues. In that case, can you have a look at the logs in /app/logs inside the container, which you can enter with (e.g.):

docker exec -it fac-e3dbb993ef0f4502b0ff162493216e7c-instance-0000000000 bash

If the first one succeeds and the second one fails, the likely issue is the proxy. Check the logs for:

docker exec -it frc-proxies-proxy bash

If the third one fails, it is likely the services forwarder. Check the logs for:

docker exec -it frc-services-forwarders-services-forwarder bash

You can find/capture all the logs with the handy command:

tar czvf ece-logs.tgz $(find /mnt/data/elastic -name "*.log" -o -name "*.out")
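
The triage for the three curl checks above can be summarised in a few lines of code. This is only a sketch of the reasoning: the function name is made up, the container names are the ones from the docker ps output in this thread, and treating any failure of the direct check on port 18025 as an instance problem is an assumption on my part.

```python
from typing import List

# Container names as they appear in the docker ps output earlier in this thread.
INSTANCE = "fac-e3dbb993ef0f4502b0ff162493216e7c-instance-0000000000"
PROXY = "frc-proxies-proxy"
FORWARDER = "frc-services-forwarders-services-forwarder"


def containers_to_inspect(direct_ok: bool, proxy_ok: bool, forwarder_ok: bool) -> List[str]:
    """Map the results of the three curl checks to the containers whose logs to read.

    direct_ok:    curl to localhost:18025 returned the Elasticsearch welcome
    proxy_ok:     curl to localhost:9200 returned it
    forwarder_ok: curl to localhost:9244 returned it
    """
    if not direct_ok:
        # Assumption: if the instance cannot be reached directly, start there.
        return [INSTANCE]
    suspects = []
    if not proxy_ok:
        suspects.append(PROXY)
    if not forwarder_ok:
        suspects.append(FORWARDER)
    return suspects
```

For example, containers_to_inspect(True, False, True) points at the proxy container only, and an empty list means all three checks passed.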

(Wesley) #10

Thanks for helping me out :).

Elastic Cloud Enterprise Installer Completed Successfully

