Access for private repositories

I have Rally running via Docker against existing Elastic clusters. I am trying to make my own custom tracks, and I cannot get Rally to git clone my private GitHub repository. If I use git over SSH, it fails with:

2019-11-18 22:30:07,90 -not-actor-/PID:1 esrally.utils.process INFO b"Cloning into '/rally/.rally/benchmarks/tracks/asset'...\n"
2019-11-18 22:30:07,322 -not-actor-/PID:1 esrally.utils.process INFO b'Host key verification failed.\r\n'
2019-11-18 22:30:07,326 -not-actor-/PID:1 esrally.utils.process INFO b'fatal: Could not read from remote repository.\n'
2019-11-18 22:30:07,327 -not-actor-/PID:1 esrally.utils.process INFO b'\n'
2019-11-18 22:30:07,328 -not-actor-/PID:1 esrally.utils.process INFO b'Please make sure you have the correct access rights\n'
2019-11-18 22:30:07,329 -not-actor-/PID:1 esrally.utils.process INFO b'and the repository exists.\n'

and if I try HTTPS I get:

2019-11-18 22:24:10,324 -not-actor-/PID:1 esrally.utils.process INFO b"fatal: could not read Username for 'https://github.com': No such device or address\n"

Is it possible to use a private GitHub repository with the elastic/rally Docker image?

Also, is it possible to use a Google Cloud bucket that has authentication required to store the compressed data files? Or is only AWS S3 supported?

Hi @mma1,

There are no strict requirements that disallow private repository use in Rally. But since you are using Docker, you need to make sure git inside the container can authenticate you, because Rally issues those git commands from within the container. There are a few ways to do this: either use SSH (which requires a key to be available inside the container) or use a personal access token. Either solution will work, but both require you to mount an external volume from your host into the container.
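For the personal access token route, one possible approach is git's built-in "store" credential helper, which reads credentials from a plain file in the user's home directory. The sketch below is an assumption, not something tested against the elastic/rally image: the function name write_git_credentials and the directory name are made up for illustration, and it assumes the track repository is configured with its https URL and that the token contains only URL-safe characters.

```shell
# Hypothetical helper: writes a git credential-store setup into a host
# directory whose files can then be mounted into the container's home.
# Nothing here is Rally-specific; it only relies on standard git behaviour.
write_git_credentials() {
  dir=$1    # host directory that will hold the two files
  user=$2   # GitHub username
  token=$3  # personal access token (assumed URL-safe)

  mkdir -p "$dir" || return 1

  # Tell git to look up credentials with the "store" helper.
  printf '[credential]\n\thelper = store\n' > "$dir/.gitconfig" || return 1

  # The store helper reads one URL per line from ~/.git-credentials.
  # Create the file with owner-only permissions, since it holds a secret.
  ( umask 077 && printf 'https://%s:%s@github.com\n' "$user" "$token" \
      > "$dir/.git-credentials" ) || return 1
  chmod 600 "$dir/.git-credentials"
}

# Possible usage (paths are placeholders):
#   write_git_credentials ./git-auth myuser "$GITHUB_TOKEN"
#   docker run --rm \
#     -v "$PWD/git-auth/.gitconfig":/rally/.gitconfig \
#     -v "$PWD/git-auth/.git-credentials":/rally/.git-credentials \
#     elastic/rally ...
```

The files still have to be readable by the user that runs git inside the container, so the ownership caveats that apply to mounted ssh keys apply here as well.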

As for AWS S3 vs Google Cloud, the code that downloads the data is here, and all it requires is an HTTPS URL that it can use to download said data. This means you can use any system that can serve your data over HTTPS.

Baz

Thanks for the reply @baz! I did try mounting my ssh key inside of docker but for some reason it didn't work. It's possible I was doing something wrong and will try again.

So there is no way to provide authentication on the URL used to download the data? I see support was added for AWS S3 here https://github.com/elastic/rally/pull/671/files which requires setting up authentication with the aws cli tool. Access to Google would require using a header value with the token.

It may be possible that I can use AWS but just wanted to double check.

I guess the problem I'm having is figuring out how to mount my ssh keys so that rally can use them. Doing something like

docker run --rm -v $PWD:/rally/.rally -v ~/.ssh:/rally/.ssh elastic/rally --pipeline=benchmark-only --target-hosts=etc

doesn't seem to work. I've been trying various places to mount them.

Yes, you are right: the code in the net module will either download via S3 using boto or download via HTTP. There is no explicit code to download using a library compatible with GCP. I can bring this back to the team as an enhancement request.
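Until something like that exists, the header-based access mentioned above can at least be done by hand outside Rally. The following is only a sketch: the function names are hypothetical, the bucket and object names are placeholders, and it assumes the gcloud CLI is installed and logged in. Whether Rally will then pick up a file fetched this way is not covered here and would need to be checked separately.

```shell
# Hypothetical helpers for fetching a protected object from a Google Cloud
# Storage bucket with a bearer token. Bucket/object names are placeholders.
gcs_object_url() {
  # Builds the storage.googleapis.com URL for bucket $1 and object $2.
  # Object names with special characters would need URL-encoding first.
  printf 'https://storage.googleapis.com/%s/%s\n' "$1" "$2"
}

fetch_gcs_object() {
  bucket=$1
  object=$2
  dest=$3

  # gcloud prints an OAuth 2.0 access token for the active account.
  token=$(gcloud auth print-access-token) || return 1

  # The token travels in the Authorization header, not in the URL.
  curl --fail --silent --show-error --location \
    --header "Authorization: Bearer $token" \
    --output "$dest" \
    "$(gcs_object_url "$bucket" "$object")"
}

# Possible usage:
#   fetch_gcs_object my-bucket corpora/documents.json.bz2 ./documents.json.bz2
```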

Rally runs as the rally user in the Docker image, which is created with the home directory /rally and UID:GID 1000. You will need to put your ssh credentials into /rally/.ssh and make sure they are chowned to 1000 so that ssh will read them rather than disregard them. I believe this goes for the folder itself too. ssh is very picky about file ownership and permissions. See if this works for you.

Below is a test from within the elastic/rally image.

rally@88dd35a9cbcc:~$ whoami
rally
rally@88dd35a9cbcc:~$ echo $HOME
/rally
rally@88dd35a9cbcc:~$ getent passwd rally
rally:x:1000:1000::/rally:/bin/bash
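The ownership and permission requirements described above could be prepared on the host roughly as follows. This is a sketch under assumptions: prepare_rally_ssh_dir is a made-up name, it works on a copy so the original ~/.ssh is left alone, and the 700/600/644 modes are the conventional ones ssh accepts rather than anything specific to the image.

```shell
# Hypothetical helper: copies an ssh directory and tightens its permissions
# so the copy can be mounted at /rally/.ssh.
prepare_rally_ssh_dir() {
  src=$1   # e.g. "$HOME/.ssh"
  dest=$2  # separate copy that will be mounted into the container

  mkdir -p "$dest" || return 1
  cp -R "$src"/. "$dest"/ || return 1

  # The directory itself must not be group- or world-writable.
  chmod 700 "$dest" || return 1

  # Private keys and config: readable and writable by the owner only.
  find "$dest" -type f -exec chmod 600 {} + || return 1

  # Public keys and known_hosts may be world-readable.
  find "$dest" -type f \( -name '*.pub' -o -name known_hosts \) \
    -exec chmod 644 {} + || return 1
}

# Possible usage; chown needs root, hence sudo:
#   prepare_rally_ssh_dir "$HOME/.ssh" ./rally-ssh
#   sudo chown -R 1000:1000 ./rally-ssh
#   docker run --rm -v "$PWD/rally-ssh":/rally/.ssh elastic/rally ...
```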

Thanks, that was helpful. I got a little further, in that I now get this error:

2019-11-19 19:53:14,860 ActorAddr-(T|:39423)/PID:13 esrally.utils.process INFO b'Load key "/rally/.ssh/id_rsa": Permission denied\r\n'
2019-11-19 19:53:14,861 ActorAddr-(T|:39423)/PID:13 esrally.utils.process INFO b'Permission denied (publickey).\r\n'
2019-11-19 19:53:14,862 ActorAddr-(T|:39423)/PID:13 esrally.utils.process INFO b'fatal: Could not read from remote repository.\n'
2019-11-19 19:53:14,863 ActorAddr-(T|:39423)/PID:13 esrally.utils.process INFO b'\n'
2019-11-19 19:53:14,863 ActorAddr-(T|:39423)/PID:13 esrally.utils.process INFO b'Please make sure you have the correct access rights\n'
2019-11-19 19:53:14,863 ActorAddr-(T|:39423)/PID:13 esrally.utils.process INFO b'and the repository exists.\n'
2019-11-19 19:53:14,923 ActorAddr-(T|:39423)/PID:13 esrally.track.loader ERROR Cannot load track [None]

I can see it trying to load the id_rsa key but getting permission denied. Here is my ssh directory:

total 80
drwxr-xr-x@   7 1000  1000    224B Nov 19 14:53 .
drwxrwxrwt   11 root  wheel   352B Nov 19 14:47 ..
-rwx------    1 1000  1000    106B Nov 19 14:47 config
-rwx------    1 1000  1000    3.2K Nov 19 14:47 id_rsa
-rwx------    1 1000  1000    748B Nov 19 14:47 id_rsa.pub
-rw-r--r--@   1 1000  1000     24K Nov 19 14:47 known_hosts

Do you see what else could be missing?

I changed the id_rsa files with chmod 600 but get the same error.

drwxr-xr-x@   7 1000  1000    224B Nov 19 15:33 .
drwxrwxrwt   11 root  wheel   352B Nov 19 15:28 ..
-rwx------@   1 1000  1000    106B Nov 19 15:28 config
-rw-------@   1 1000  1000    3.2K Nov 19 15:28 id_rsa
-rw-------@   1 1000  1000    748B Nov 19 15:28 id_rsa.pub
-rw-r--r--@   1 1000  1000     24K Nov 19 15:28 known_hosts

Just to make sure, does this private key work to check out that repo on your host machine?

Yeah, that .ssh setup is what I've been using for years at work. I've decided to skip using the docker image for now and use the esrally binary directly, which is working as I would expect.

I'll spend some time tomorrow testing the actual Docker image as well; something definitely seems wrong. Your setup seems quite sane, so it's maybe being invoked by a different user...

One more question: do you have a passphrase associated with this ssh key? You may have added it to your local ssh-agent and so not normally have to enter it. This could also be causing the issue.
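One way to answer the passphrase question without relying on the agent is to ask ssh-keygen to read the private key with an empty passphrase. The helper name below is made up for illustration; the check itself only uses ssh-keygen's -y, -P and -f options.

```shell
# Hypothetical helper: succeeds only if the private key file $1 can be
# loaded with an empty passphrase. It also fails if the key is unreadable
# or its permissions are too open, so a failure is not proof of a passphrase.
key_has_no_passphrase() {
  # -y derives the public key from the private key;
  # -P "" supplies an empty passphrase so ssh-keygen never prompts.
  ssh-keygen -y -P "" -f "$1" >/dev/null 2>&1
}

# Possible usage:
#   if key_has_no_passphrase "$HOME/.ssh/id_rsa"; then
#     echo "no passphrase: usable non-interactively"
#   else
#     echo "passphrase-protected, unreadable, or bad permissions"
#   fi
```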

Hrmm it's possible I do. I can make another ssh key without one as a test, to confirm that.

A new key with no passphrase is still giving me problems (but it's possible I screwed something up). Would I also have to add the key to the ssh-agent inside the docker image?

So I went ahead and did a test to make sure it was working. I spun up a passphrase-free key and tested via:

% docker run -dit -v /home/baz/.ssh_keyfree/:/rally/.ssh elastic/rally bash
% docker exec -it ddeb184e7c8c bash

rally@ddeb184e7c8c:~$ git clone git@github.com:elastic/rally-tracks.git
Cloning into 'rally-tracks'...
remote: Enumerating objects: 4225, done.
remote: Total 4225 (delta 0), reused 0 (delta 0), pack-reused 4225
Receiving objects: 100% (4225/4225), 1.20 MiB | 0 bytes/s, done.
Resolving deltas: 100% (2898/2898), done.

And then I tried again without mounting the key directory, and from inside the container got:

rally@442eb19e3a8c:~$ git clone git@github.com:elastic/rally-tracks.git
Cloning into 'rally-tracks'...
The authenticity of host 'github.com (140.82.114.3)' can't be established.
RSA key fingerprint is SHA256:nThbg6kXUpJWGl7E1IGOCspRomTxdCARLviKw6E5SY8.
Are you sure you want to continue connecting (yes/no)? ^C
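The prompt above is most likely what surfaces as "Host key verification failed" when git runs without a terminal, since nobody can answer yes. A quick way to check whether the directory you plan to mount already knows the host is sketched below; known_hosts_has is a made-up name, and the check relies only on ssh-keygen -F, which searches a known_hosts file for a hostname.

```shell
# Hypothetical helper: succeeds if known_hosts file $1 has an entry for host $2.
known_hosts_has() {
  file=$1
  host=$2
  # ssh-keygen -F prints any matching entries; no output means no entry
  # (or a missing/unreadable file).
  [ -n "$(ssh-keygen -F "$host" -f "$file" 2>/dev/null)" ]
}

# Possible usage (verify the fingerprint against the one GitHub publishes
# before trusting a key fetched with ssh-keyscan):
#   known_hosts_has ./rally-ssh/known_hosts github.com \
#     || ssh-keyscan github.com >> ./rally-ssh/known_hosts
```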

I'm running all this on Linux. A colleague mentioned there is a long-standing issue with macOS, detailed here, that could be causing your problem if you are on macOS.

But rest assured that this does work in a Linux + Docker environment with the Rally container, as per the example above.
