Thanks so much for testing out ECE and for reporting this issue! What seems to be happening here is that the installation is missing the Logging and Metrics cluster, but I'm not sure how that could have happened. Can you verify that you see this cluster in the cluster list?
Also, could you please get us the response from the API call that the UI makes for any cluster that is experiencing this issue? If you open the dev tools in a browser and navigate to the network tab, you'll see the api/v0.1/regions/ece-region/clusters/eb1f0e8c82dc4425b32250fb5b3d0ec5 API call. Can you copy/paste the response here?
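If it's easier than digging through dev tools, a minimal Python sketch like the one below should return the same JSON from outside the browser. The host, port, and admin credentials are assumptions on my part - substitute whatever you normally use to reach the Cloud UI.

# Minimal sketch (not an official client): fetch the same cluster JSON the UI requests.
# ECE_HOST, the port, and the credentials are assumptions - use whatever you normally
# use to log in to the Cloud UI. verify=False is only because a fresh ECE install
# typically has a self-signed certificate.
import requests

ECE_HOST = "https://YOUR_COORDINATOR_HOST:12443"   # assumption: adjust to your install
CLUSTER_ID = "eb1f0e8c82dc4425b32250fb5b3d0ec5"    # the cluster id from the UI URL

resp = requests.get(
    f"{ECE_HOST}/api/v0.1/regions/ece-region/clusters/{CLUSTER_ID}",
    auth=("admin", "YOUR_ADMIN_PASSWORD"),         # assumption: basic auth as the admin user
    verify=False,
)
print(resp.status_code)
print(resp.text)   # this is the JSON we'd like to see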
I had a quick look at everything that could cause hrefs.logging and hrefs.metrics to be missing from the API call @Andrew_Moldovan mentioned, which is what causes that faulty endpoint.
The most likely cause is that the install failed before the location of the logging cluster could be written. Could you check your /mnt/data/elastic/logs/bootstrap-logs/bootstrap.log? At the end of the log I have a set of entries like:
[2017-06-20 18:19:51,848][INFO ][no.found.bootstrap.BootstrapInitial] Enabling Kibana for Logging and Metrics Cluster {}
[2017-06-20 18:19:52,400][INFO ][no.found.bootstrap.ServiceLayerBootstrap] waiting for cluster [KibanaCluster(6434b73e48ae464997a3ad4e8e65eec7)] {}
[2017-06-20 18:19:52,407][INFO ][no.found.bootstrap.ServiceLayerBootstrap] Waiting for [waiting-for-cluster] to complete. Retrying every [1 second] (cause: [java.lang.Exception: not yet started]) {}
[2017-06-20 18:24:24,664][INFO ][no.found.bootstrap.BootstrapInitial] Shutting down bootstrapper {}
Do you have any timeout errors there at all?
(We have been seeing some issues where the Kibana docker download and the subsequent cluster provisioning can take more than the 10-minute timeout we allocate - we are upping that timeout and also updating the docs to explain how to address it in 1.0.2.)
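If the bootstrap log is long, a minimal sketch like the one below (using the log path I mentioned above - adjust if your install uses a different data directory) will pull out just the error/timeout lines:

# Minimal sketch: print only the error/timeout lines from the bootstrap log.
# The path is the one mentioned above; change it if your data directory differs.
LOG_PATH = "/mnt/data/elastic/logs/bootstrap-logs/bootstrap.log"

with open(LOG_PATH, errors="replace") as f:
    for line in f:
        if "ERROR" in line or "TimeoutException" in line:
            print(line.rstrip())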
[2017-06-09 13:46:05,968][INFO ][no.found.bootstrap.ServiceLayerBootstrap] Waiting for [ensuring-plan] to complete. Retrying every [1 second] (cause: [java.lang.Exception: not yet started]) {}
[2017-06-09 13:46:34,375][INFO ][no.found.bootstrap.BootstrapInitial] Enabling Kibana for Logging and Metrics Cluster {}
[2017-06-09 13:46:34,673][INFO ][no.found.bootstrap.ServiceLayerBootstrap] waiting for cluster [KibanaCluster(bf7f8aeda102430381bb515c06e5d9c6)] {}
[2017-06-09 13:46:34,696][INFO ][no.found.bootstrap.ServiceLayerBootstrap] Waiting for [waiting-for-cluster] to complete. Retrying every [1 second] (cause: [java.lang.Exception: not yet started]) {}
[2017-06-09 13:56:34,678][ERROR][no.found.bootstrap.BootstrapInitial$] Unhandled error. {}
java.util.concurrent.TimeoutException: Futures timed out after [600 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:190)
at no.found.bootstrap.BootstrapInitial.bootstrapLoggingMetricsCluster(BootstrapInitial.scala:958)
at no.found.bootstrap.BootstrapInitial.bootstrap(BootstrapInitial.scala:611)
at no.found.bootstrap.BootstrapInitial$.delayedEndpoint$no$found$bootstrap$BootstrapInitial$1(BootstrapInitial.scala:1153)
at no.found.bootstrap.BootstrapInitial$delayedInit$body.apply(BootstrapInitial.scala:1147)
at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at no.found.bootstrap.BootstrapInitial$.main(BootstrapInitial.scala:1147)
at no.found.bootstrap.BootstrapInitial.main(BootstrapInitial.scala)
And about Kibana: I also notice that provisioning Kibana, or moving Kibana nodes from one allocator to another, is very, very slow (in fact, I cancel and re-apply the modification to move on). Should I open a new topic to explain what I observed?
Those logs do confirm that the install timed out right at the end (because of the combination of the time taken to download the Kibana docker image and the time to provision the cluster).
As I mentioned, we are addressing that timeout problem in July's release. In the meantime, getting a working version unfortunately requires blowing away the current installation and re-installing (importantly, the teardown preserves the Elasticsearch and Kibana docker images); a rough sketch is below.
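This is only an illustration, assuming the host is dedicated to ECE and uses the /mnt/data/elastic data directory from this thread - please follow the removal steps in the docs for your version rather than treating it as exact.

# Illustrative sketch only, assuming a host dedicated to ECE with the
# /mnt/data/elastic data directory used in this thread - check the ECE docs
# for the exact teardown steps before running anything destructive.
import subprocess

# Remove all containers; the docker images stay cached, so the re-install
# skips the slow Elasticsearch/Kibana downloads.
container_ids = subprocess.run(
    ["docker", "ps", "-aq"], capture_output=True, text=True, check=True
).stdout.split()
if container_ids:
    subprocess.run(["docker", "rm", "-f", *container_ids], check=True)

# Wipe the ECE host data directory so the next install starts clean.
subprocess.run(["sudo", "rm", "-rf", "/mnt/data/elastic"], check=True)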
Then re-install, and it shouldn't time out during the install (and hence the logging and metrics links will work).
It might be worth re-trying the Kibana re-allocation with the new install, though I am not aware of any step at the end of the installation that would affect it, so the slowness will likely persist - please do open a new thread and we'll look into it.
@JohnnyB Don't worry about posting the HAR file now; we know what the issue is, as @Alex_Piggott described.
The 404 for phone-home/data that you are seeing is because you opted out of sending us metrics information about your installation. We had a bug in the UI where it would continuously try to fetch that data even though you had opted out (the fetch would fail because of the opt-out, hence the error). The bug is fixed, and you shouldn't see that anymore in the July release. Sorry about that.