Hello, first post here. I tried to look up existing topics but found none that matched the issue I was having. I manually downloaded the ECE script to my runner host and attempted to run the install script to add my runner with the allocator and proxy roles.
I attempted to use an AWS Elastic File System mounted on my host at /mnt/efs/data/elastic as the location for the persisted container data specified by --host-storage-path and ran into an issue. However, I was ultimately able to find a workaround, so I wanted to report it here for feedback and to get advice on any downstream impact it may have on the resiliency of my platform.
For full transparency, /etc/fstab on my runner host contains the appropriate entry for the root device storage path of the mounted nfs4 volume:
<AWS-EFS-FILE-SYSTEM-ID>.efs.<MY-AWS-REGION>.amazonaws.com:/ /mnt/efs/data nfs4 nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport,_netdev 0 0
To ensure that a usable directory exists at /mnt/efs/data/elastic, I created it with the required permission mode and owner/group before the install.
[elastic@ip-XXX-XXX-XXX-XXX ~]$ sudo install -o $USER -g $USER -d -m 700 /mnt/efs/data/elastic
It should be noted that I am using one of the pre-configured elastic CentOS 8 community AMIs indicated in the current-amis.json list.
The run command of my EC2 instance contains the necessary logic (auto-generated by AWS) which mounts the nfs4 volume on the target host and is executed/tested on instance start. I ran the following install command and received a Java/Scala parsing exception error:
[elastic@ip-XXX-XXX-XXX-XXX ~]$ bash <(curl -fsSL https://download.elastic.co/cloud/elastic-cloud-enterprise.sh) install --debug \
--overwrite-existing-image \
--roles "proxy,allocator" \
--memory-settings '{"runner":{"xms":"1G","xmx":"1G"},"allocator":{"xms":"4G","xmx":"4G"},"proxy":{"xms":"8G","xmx":"8G"},"zookeeper":{"xms":"4G","xmx":"4G"},"director":{"xms":"1G","xmx":"1G"},"constructor":{"xms":"4G","xmx":"4G"},"admin-console":{"xms":"4G","xmx":"4G"}}' \
--cloud-enterprise-version "2.6.2" \
--availability-zone "<MY_ECE_ZONE>" \
--coordinator-host "<MY_COORDINATOR_HOST_IP>" \
--host-ip "<MY_RUNNER_HOST_IP>" \
--host-docker-host /var/run/docker.sock \
--host-storage-path /mnt/efs/data/elastic \
--roles-token \'${ECE_INSTALL_ROLES_TOKEN}\' \
--external-hostname "<MY_EXTERNAL_HOSTNAME>" \
--api-base-url "<MY_API_BASE_URL>"
It is my understanding that the error was caused by a colon (":") appended to the resolved value of HOST_STORAGE_DEVICE_PATH, which is read in the bootstrap container config file /elastic_cloud_apps/bootstrap/bootstrap.conf at line 20:
1 include "reference"
2 include "application-bootstrap"
3 include file("/elastic_cloud_apps/additional.conf")
.
.
.
17 host {
18 storage-path = ${HOST_STORAGE_PATH}
19 storage-root-volume-path = ${HOST_STORAGE_ROOT_VOLUME_PATH}
20 storage-device-path = ${HOST_STORAGE_DEVICE_PATH} <============== X
21 docker-config-path = ${?HOST_DOCKER_CONFIG_PATH}
22 }
23
.
.
.
41 }
When I set the volume mount to /mnt/efs/data/elastic, HOST_STORAGE_DEVICE_PATH resolves to the Filesystem name reported by the 'disk free' (df) command used to obtain it (truncated output below).
[elastic@ip-XXX-XXX-XXX-XXX data]$ df -hT
Filesystem Type Size Used Avail Use% Mounted on
/dev/mapper/lxc-data xfs 119G 16G 104G 13% /mnt/data
<AWS-EFS-FILE-SYSTEM-ID>.efs.<MY-AWS-REGION>.amazonaws.com:/ nfs4 8.0E 0 8.0E 0% /mnt/efs/data
During install, bootstrap execution appears to work fine until an "invalid volume specification" error occurs while creating the runner container [frc-runners-runner], as shown in the log output below, where <INSTALL-HOST-IP>, <MY-COORDINATOR-HOST-IP>, <AWS-EFS-FILE-SYSTEM-ID>, and <MY-AWS-REGION> stand in for their respective values:
.
.
.
[2021-06-09 01:51:11,015][INFO ][org.apache.curator.framework.state.ConnectionStateManager] State change: CONNECTED {}
[2021-06-09 01:51:11,056][INFO ][no.found.bootstrap.BootstrapAdditional] Starting local runner {}
[2021-06-09 01:51:11,059][INFO ][no.found.bootstrap.containers.RunnerContainerBootstrap] Bootstrapping container [runners-runner] {}
[2021-06-09 01:52:41,382][INFO ][no.found.bootstrap.BootstrapAdditional$] Api Exception: {}
no.found.docker.DockerApiException: Unable to create container [frc-runners-runner]
Docker API request: [HttpRequest(HttpMethod(POST),http://localhost:2375/v1.22/containers/create?name=frc-runners-runner,-,-,HttpProtocol(HTTP/1.1)) [INJECTED BYTECODE STRING REDACTION: Until https:/ist(Api-Version: 1.40, Docker-Experimental: false, Ostype: linux, Server: Docker/19.03.13 (linux), X-Content-Type-Options: nosniff, Date: Wed, 09 Jun 2021 01:52:41 GMT), invalid volume specification: '<AWS-EFS-FILE-SYSTEM-ID>.efs.<MY-AWS-REGION>.amazonaws.com::<AWS-EFS-FILE-SYSTEM-ID>.efs.<MY-AWS-REGION>.amazonaws.com:'
, HttpProtocol(HTTP/1.1)))]
at no.found.docker.DockerApiException$.apply(DockerApiException.scala:93)
at no.found.docker.DockerApiException$.apply(DockerApiException.scala:97)
at no.found.docker.DockerApi.$anonfun$createContainer$1(DockerApi.scala:272)
at scala.util.Success.$anonfun$map$1(Try.scala:255)
at scala.util.Success.map(Try.scala:213)
at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:92)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:85)
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:92)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:49)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
[2021-06-09 01:52:41,405][ERROR][scala.Predef$ ] Uncaught throwable occurred on thread: [main], calling System.exit(1) {}
no.found.docker.DockerApiException: Unable to create container [frc-runners-runner]
Docker API request: [HttpRequest(HttpMethod(POST),http://localhost:2375/v1.22/containers/create?name=frc-runners-runner,-,-,HttpProtocol(HTTP/1.1)) [INJECTED BYTECODE STRING REDACTION: Until https:/ist(Api-Version: 1.40, Docker-Experimental: false, Ostype: linux, Server: Docker/19.03.13 (linux), X-Content-Type-Options: nosniff, Date: Wed, 09 Jun 2021 01:52:41 GMT), invalid volume specification: '<AWS-EFS-FILE-SYSTEM-ID>.efs.<MY-AWS-REGION>.amazonaws.com::<AWS-EFS-FILE-SYSTEM-ID>.efs.<MY-AWS-REGION>.amazonaws.com:'
, HttpProtocol(HTTP/1.1)))]
at no.found.docker.DockerApiException$.apply(DockerApiException.scala:93)
at no.found.docker.DockerApiException$.apply(DockerApiException.scala:97)
at no.found.docker.DockerApi.$anonfun$createContainer$1(DockerApi.scala:272)
at scala.util.Success.$anonfun$map$1(Try.scala:255)
at scala.util.Success.map(Try.scala:213)
at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:92)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:85)
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:92)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:49)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
[2021-06-09 01:52:41,411][INFO ][no.found.util.LogApplicationExit$] Application is exiting {}
To debug my issue, I downloaded a local copy of the ECE script and slightly edited the function responsible for setting HOST_STORAGE_DEVICE_PATH ("createAndValidateHostStoragePath()") so that it uses the same path as HOST_STORAGE_PATH for my very specific use case, at which point I was able to complete a successful install.
createAndValidateHostStoragePath() {
  uid=`id -u`
  gid=`id -g`
  if [[ ! -e ${HOST_STORAGE_PATH} ]]; then
    mkdir -p ${HOST_STORAGE_PATH}
    chown -R $uid:$gid ${HOST_STORAGE_PATH}
  fi
  if [[ ! -r ${HOST_STORAGE_PATH} ]]; then
    printf "${RED}%s${NC}\n" "Host storage path ${HOST_STORAGE_PATH} exists but doesn't have read permissions for user '${USER}'."
    printf "${RED}%s${NC}\n" "Please supply the correct permissions for the host storage path."
    exit $GENERAL_ERROR_EXIT_CODE
  fi
  if [[ ! -w ${HOST_STORAGE_PATH} ]]; then
    printf "${RED}%s${NC}\n" "Host storage path ${HOST_STORAGE_PATH} exists but doesn't have write permissions for user '${USER}'."
    printf "${RED}%s${NC}\n" "Please supply the correct permissions for the host storage path."
    exit $GENERAL_ERROR_EXIT_CODE
  fi
  # ORIGINAL
  # export HOST_STORAGE_DEVICE_PATH=$(df --output=source ${HOST_STORAGE_PATH} | sed 1d)
  # ********** MY EDIT **********
  export HOST_STORAGE_DEVICE_PATH=${HOST_STORAGE_PATH}
}
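A narrower variant of the edit above, sketched here as an assumption rather than anything the ECE script ships, would keep the df lookup for local block devices and fall back to HOST_STORAGE_PATH only when the reported source is not an absolute path. The helper name pick_device_path is hypothetical, the EFS hostname is a made-up placeholder, and whether ECE accepts the fallback value in every code path is untested.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the ECE script): choose the value to export
# as HOST_STORAGE_DEVICE_PATH.
#   $1 = host storage path
#   $2 = source reported by df for that path
pick_device_path() {
  local storage_path="$1" df_source="$2"
  if [[ "$df_source" == /* ]]; then
    # Local block device such as /dev/mapper/lxc-data: keep the df value.
    echo "$df_source"
  else
    # Network source such as host:/export: fall back to the storage path.
    echo "$storage_path"
  fi
}

# Sample inputs shaped like the df output shown earlier (placeholder EFS ID/region).
pick_device_path "/mnt/data" "/dev/mapper/lxc-data"
pick_device_path "/mnt/efs/data/elastic" "fs-12345678.efs.us-east-1.amazonaws.com:/"
```

Inside createAndValidateHostStoragePath() this could replace the export line with something like export HOST_STORAGE_DEVICE_PATH="$(pick_device_path "${HOST_STORAGE_PATH}" "$(df --output=source "${HOST_STORAGE_PATH}" | sed 1d)")", so hosts backed by local disks would behave as before.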
Although the docker run arguments of the bootstrap initiator container (elastic-cloud-enterprise-installer) use -v ${HOST_STORAGE_PATH}:${HOST_STORAGE_PATH} to bind mount the volume, it appears the subsequent frc-runners-runner container attempts to use -v ${HOST_STORAGE_DEVICE_PATH}:${HOST_STORAGE_DEVICE_PATH}, which invariably results in the "invalid volume specification" error I reported earlier.
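To illustrate why such a value is rejected, here is a minimal sketch that joins a source and a destination the way a -v argument is written and counts the ":" separators. A Linux bind mount spec has the form source:destination[:options], so at most two separators are expected. The function names are hypothetical, the EFS hostname is a placeholder, and the trailing colon on the device path is taken from the shape of the string in the error message, not from the ECE source.

```shell
#!/usr/bin/env bash
# Join a source and destination the way a "-v src:dst" argument is written.
volume_spec() {
  printf '%s:%s' "$1" "$2"
}

# Count ":" separators by deleting every character that is not a colon.
count_colons() {
  local only_colons="${1//[^:]/}"
  echo "${#only_colons}"
}

storage_path="/mnt/efs/data/elastic"
# Placeholder ID/region; the trailing colon mirrors the logged error string.
device_path="fs-12345678.efs.us-east-1.amazonaws.com:"

good_spec="$(volume_spec "$storage_path" "$storage_path")"
bad_spec="$(volume_spec "$device_path" "$device_path")"

echo "$good_spec -> $(count_colons "$good_spec") separator(s)"
echo "$bad_spec -> $(count_colons "$bad_spec") separator(s)"
```

The first spec contains one separator, while the second contains three and has the same host::host: shape as the string Docker reported.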
To retain the original intent of the exported variable, I also tried the following command substitution, which was meant to keep the EFS DNS name in the device path for the install checks (as seen at line 20 of bootstrap.conf). It also proved unsuccessful:
HOST_STORAGE_DEVICE_PATH=$(df --output=source ${HOST_STORAGE_PATH} | sed 1d | awk -v efs_path=${HOST_STORAGE_PATH} '{split($0,a,":"); print a[1]"\042:"efs_path"\042"}')
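To see what that substitution produces without a live EFS mount, the same awk program can be fed a sample string in place of the df output. This is only a sketch: the EFS ID and region are placeholders, and the variable names df_source and result are introduced here for illustration.

```shell
#!/usr/bin/env bash
# Placeholder for the df "source" column on an EFS mount (made-up ID/region).
df_source="fs-12345678.efs.us-east-1.amazonaws.com:/"
HOST_STORAGE_PATH="/mnt/efs/data/elastic"

# Same awk program as in the attempt above, fed the sample string instead of
# live df output. "\042" is the octal escape for a double quote.
result="$(printf '%s\n' "$df_source" \
  | awk -v efs_path="${HOST_STORAGE_PATH}" \
      '{split($0,a,":"); print a[1]"\042:"efs_path"\042"}')"

echo "$result"
```

The resulting value still contains a colon, plus two literal double-quote characters, so it would not form a clean source:destination pair if it were reused on both sides of a -v argument.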
From my observations, HOST_STORAGE_DEVICE_PATH seems to be a redundant environment variable in this context.
I have a premium orchestration license but posed the question here for the sake of anyone else who might have a similar use-case.
Any input would be appreciated.