Mapping track path on local disk under docker

Hi,
I'm struggling to switch a track from the remote repo to a local copy. I'm trying to run a Rally benchmark as a simple test with "--track-path=/rally/.rally/nyc_taxis",
so I've mapped a volume to the above path and changed the path in track.json,
but I'm getting an error saying it's not possible:

[INFO] Race id is [0036bcbb-8bfd-4f96-9391-04af02039ba0]
[ERROR] Cannot race. Error in task executor
        Not supported URL scheme file

Hello!

Could you provide the full docker command you are using to start up the container?

What platform is the docker host?

Presuming you have mapped a local folder to the docker container at /rally/.rally , does that folder contain the nyc_taxis folder?

Thanks

Gareth

I used the command below (yes, the local folder is mapped to /rally/.rally):

docker run --rm --user root --name esrally_perfomance -v /home/elasticsearch/rally-tracks-master:/rally/.rally/ af2.my_repo/elk/rally:latest race  --track-path=/rally/.rally/nyc_taxis  --target-hosts=http://nginx_es:9500  --pipeline=benchmark-only --client-options="timeout:60,use_ssl:false,verify_certs:false,basic_auth_user:'rally',basic_auth_password:'pass'"
docker --version
Docker version 26.0.0, build 2ae903e
cat /etc/redhat-release
AlmaLinux release 8.6 (Sky Tiger)

So I have changed base-url in that track.json to:

"corpora": [
    {
      "name": "nyc_taxis",
      "base-url": "file://rally/.rally/nyc_taxis",
      "documents": [
        {
          "target-index": "nyc_taxis",
          "source-file": "documents.json.bz2",
          "#COMMENT": "ML benchmark rely on the fact that the document count stays constant.",
          "document-count": 165346692,
          "compressed-bytes": 4820107188,
          "uncompressed-bytes": 79802445255
        }
      ]
    },
    {
      "name": "nyc_payment_types",
      "base-url": "file://rally/.rally/nyc_taxis",
      "documents": [
        {
          "target-index": "nyc_payment_types",
          "source-file": "nyc_payment_types.json.bz2",
          "#COMMENT": "Manually created ENRICH lookup table for payment_type field",
          "document-count": 5,
          "compressed-bytes": 139,
          "uncompressed-bytes": 245
        }
      ]
    },
    {
      "name": "nyc_rate_codes",
      "base-url": "file://rally/.rally/nyc_taxis",
      "documents": [
        {
          "target-index": "nyc_rate_codes",
          "source-file": "nyc_rate_codes.json.bz2",
          "#COMMENT": "Manually created ENRICH lookup table for rate_code_id field",
          "document-count": 7,
          "compressed-bytes": 166,
          "uncompressed-bytes": 337
        }
      ]
    },
    {
      "name": "nyc_trip_types",
      "base-url": "file://rally/.rally/nyc_taxis",
      "documents": [
        {
          "target-index": "nyc_trip_types",
          "source-file": "nyc_trip_types.json.bz2",
          "#COMMENT": "Manually created ENRICH lookup table for trip_type field",
          "document-count": 2,
          "compressed-bytes": 94,
          "uncompressed-bytes": 75
        }
      ]
    },
    {
      "name": "nyc_vendors",
      "base-url": "file://rally/.rally/nyc_taxis",
      "documents": [
        {
          "target-index": "nyc_vendors",
          "source-file": "nyc_vendors.json.bz2",
          "#COMMENT": "Manually created ENRICH lookup table for vendor_id field",
          "document-count": 2,
          "compressed-bytes": 92,
          "uncompressed-bytes": 67
        }
      ]
    }
  ],

Ah, I see what you're trying to do.

The error you are getting comes from urllib3, which we use to parse the URL scheme, and it does not accept the file scheme (see urllib3/src/urllib3/exceptions.py on the main branch of the urllib3/urllib3 repository on GitHub).

So you just want to use already downloaded files instead of downloading them each time?

There are two options from what I can see:

  1. Instead of mounting the track directory to .rally, have a folder structure as follows (this option uses the standard Rally directory structure, which may avoid other unknown issues later in the benchmark):

/home/elasticsearch/.rally/benchmarks/data/nyc_taxis <---- Add your documents.json.bz2 etc here,
/home/elasticsearch/.rally/benchmarks/tracks/default <--- Add your track repo here, without your changes to base-url.

Then update the mount to -v /home/elasticsearch/.rally:/rally/.rally/
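
As a rough sketch of the layout described above, the directories could be created like this. RALLY_HOME is a hypothetical variable introduced only for this example; it defaults to $HOME/.rally, which I am assuming corresponds to /home/elasticsearch/.rally on your host. The docker command in the comments reuses your image name and options. Selecting the track with --track instead of --track-path is my assumption for this layout, so treat it as something to try rather than a confirmed fix.

```shell
# Hypothetical variable for this sketch: the host folder that will be
# mounted at /rally/.rally (assumed to be /home/elasticsearch/.rally).
RALLY_HOME="${RALLY_HOME:-$HOME/.rally}"

# Corpus files (documents.json.bz2 etc.) go here.
mkdir -p "$RALLY_HOME/benchmarks/data/nyc_taxis"

# The unmodified track repo (original base-url values) goes here.
mkdir -p "$RALLY_HOME/benchmarks/tracks/default"

# Then start the container with the updated mount, for example
# (not executed here; --track=nyc_taxis is an assumption):
#   docker run --rm --user root --name esrally_perfomance \
#     -v "$RALLY_HOME":/rally/.rally/ \
#     af2.my_repo/elk/rally:latest race --track=nyc_taxis \
#     --target-hosts=http://nginx_es:9500 --pipeline=benchmark-only
```

Copy the .bz2 files and the track repo into those two folders before starting the container, and keep your existing --client-options as they were.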

  2. Instead of modifying the directory structure, you should be able to remove the base-url entries completely and then test. If the files are in the same directory as the track.json, Rally should pick them up and use them.
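
If you go the route of removing base-url, one way to strip those lines without editing by hand is a sed one-liner. This is only a sketch: TRACK_JSON is a hypothetical variable, and when it is not set the snippet runs against a small throwaway sample so it can be tried safely first. It assumes each "base-url" entry sits on its own line, as in the snippet you posted.

```shell
# Hypothetical variable: point TRACK_JSON at your real track.json to apply
# the change; otherwise a throwaway sample file is used for demonstration.
TRACK_JSON="${TRACK_JSON:-$(mktemp)}"

# Fill the sample only if the target file is empty (i.e. the demo case).
if [ ! -s "$TRACK_JSON" ]; then
cat > "$TRACK_JSON" <<'EOF'
{
  "name": "nyc_taxis",
  "base-url": "file://rally/.rally/nyc_taxis",
  "documents": [
    { "source-file": "documents.json.bz2" }
  ]
}
EOF
fi

# Delete every line that defines "base-url"; the original is kept as .bak.
sed -i.bak '/"base-url"[[:space:]]*:/d' "$TRACK_JSON"

# Show the result so the change can be checked by eye.
cat "$TRACK_JSON"
```

The .bak copy makes it easy to revert if Rally still complains afterwards.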

Try that and let me know how you get on :slight_smile:

Gareth

Sure, I'll try it and get back with feedback.