Not able to download track

Hi, I am using the script below to download a track:
https://raw.githubusercontent.com/elastic/rally-tracks/master/download.sh

I am able to download tracks on Ubuntu 20.04:

Cloning into '/home/user/.rally/benchmarks/tracks/default'...
remote: Enumerating objects: 6780, done.
remote: Counting objects: 100% (1341/1341), done.
remote: Compressing objects: 100% (154/154), done.
remote: Total 6780 (delta 1275), reused 1191 (delta 1182), pack-reused 5439
Receiving objects: 100% (6780/6780), 1.71 MiB | 1.72 MiB/s, done.
Resolving deltas: 100% (4602/4602), done.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  252M  100  252M    0     0  10.4M      0  0:00:24  0:00:24 --:--:-- 22.0M
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 20985  100 20985    0     0  14452      0  0:00:01  0:00:01 --:--:-- 14452
Created data for geonames in rally-track-data-geonames.tar. Next steps:

1. Copy it to the user home directory on the target machine(s).
2. Extract with tar -xf rally-track-data-geonames.tar (will be extracted to ~/.rally/benchmarks).

But when I try to download on CentOS Stream release 8, I do not get any error, yet no data is downloaded:

 /home/user/.rally/benchmarks/tracks/default
cloning repo
Cloning into '/home/sfdev/.rally/benchmarks/tracks/default'...
remote: Enumerating objects: 6780, done.
remote: Counting objects: 100% (1341/1341), done.
remote: Compressing objects: 100% (154/154), done.
remote: Total 6780 (delta 1275), reused 1191 (delta 1182), pack-reused 5439
Receiving objects: 100% (6780/6780), 1.71 MiB | 3.74 MiB/s, done.
Resolving deltas: 100% (4602/4602), done.

Created data for geonames in rally-track-data-geonames.tar. Next steps:

1. Copy it to the user home directory on the target machine(s).
2. Extract with tar -xf rally-track-data-geonames.tar (will be extracted to ~/.rally/benchmarks).

Hi Amit,

Yours is the output we would expect if track data was already resident in the target. Can you check if it is there?

$ ls -l ~/.rally/benchmarks/data/geonames
total 259024
-rwxrwxrwx 1 1024 100     20985 May 26 10:04 documents-2-1k.json.bz2
-rwxrwxrwx 1 1024 100 265208777 May 26 10:04 documents-2.json.bz2

If track data is not present in the target, then we'll need to find out why. Could you send the complete output of download.sh executed with bash -x, including the command?

$ bash -x download.sh geonames

Thank you,
Jason
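When capturing that trace, keep in mind that bash -x writes its trace lines to stderr, so both streams need to be saved to get the complete output. A minimal sketch follows; the helper name trace_script and the log file name are hypothetical, not part of download.sh.

```shell
# Hypothetical helper: run a script under bash -x and save everything it prints.
# bash -x writes the "+ command" trace to stderr, so stderr is merged into stdout.
trace_script() {
  local log="$1"
  shift
  # Remaining arguments are the script and its own arguments.
  bash -x "$@" > "$log" 2>&1
}

# Usage (assumed file names):
#   trace_script download-trace.log download.sh geonames
```

The resulting log can then be attached to a reply as a single file.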

Hi, here is the result with bash -x:

+ set -e
+ set -u
+ set -o pipefail
+ readonly ROOT=.rally/benchmarks
+ ROOT=.rally/benchmarks
+ readonly URL=http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora
+ URL=http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora
+ SOURCE=download.sh
+ '[' -h download.sh ']'
+++ dirname download.sh
++ cd -P .
++ pwd
+ readonly CURR_DIR=/home/sfdev/akumars4/applications.benchmarking.benchmark.platform-hero-features/workload/ESRally
+ CURR_DIR=/home/sfdev/akumars4/applications.benchmarking.benchmark.platform-hero-features/workload/ESRally
+ '[' 1 '!=' 1 ']'
+ readonly TRACK=geonames
+ TRACK=geonames
+ TARGETS=()
+ readonly REPO_TARGET=.rally/benchmarks/tracks/default
+ REPO_TARGET=.rally/benchmarks/tracks/default
+ TARGETS[${#TARGETS[*]}]=.rally/benchmarks/tracks/default
+ '[' '!' -d /home/sfdev/.rally/benchmarks/tracks/default ']'
+ git clone https://github.com/elastic/rally-tracks.git /home/sfdev/.rally/benchmarks/tracks/default
Cloning into '/home/sfdev/.rally/benchmarks/tracks/default'...
remote: Enumerating objects: 6780, done.
remote: Counting objects: 100% (1341/1341), done.
remote: Compressing objects: 100% (154/154), done.
remote: Total 6780 (delta 1275), reused 1191 (delta 1182), pack-reused 5439
Receiving objects: 100% (6780/6780), 1.71 MiB | 4.19 MiB/s, done.
Resolving deltas: 100% (4602/4602), done.
+ '[' '!' -d /home/sfdev/.rally/benchmarks/tracks/default/geonames ']'
++ cat /home/sfdev/.rally/benchmarks/tracks/default/geonames/files.txt
+ readonly 'FILES=documents-2.json.bz2
documents-2-1k.json.bz2'
+ FILES='documents-2.json.bz2
documents-2-1k.json.bz2'
+ for f in ${FILES}
+ TARGET_ROOT=.rally/benchmarks/data/geonames
+ TARGET_PATH=.rally/benchmarks/data/geonames/documents-2.json.bz2
+ mkdir -p /home/sfdev/.rally/benchmarks/data/geonames
+ TARGETS[${#TARGETS[*]}]=.rally/benchmarks/data/geonames/documents-2.json.bz2
+ '[' '!' -f /home/sfdev/.rally/benchmarks/data/geonames/documents-2.json.bz2 ']'
+ curl -o /home/sfdev/.rally/benchmarks/data/geonames/documents-2.json.bz2 http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames/documents-2.json.bz2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
+ for f in ${FILES}
+ TARGET_ROOT=.rally/benchmarks/data/geonames
+ TARGET_PATH=.rally/benchmarks/data/geonames/documents-2-1k.json.bz2
+ mkdir -p /home/sfdev/.rally/benchmarks/data/geonames
+ TARGETS[${#TARGETS[*]}]=.rally/benchmarks/data/geonames/documents-2-1k.json.bz2
+ '[' '!' -f /home/sfdev/.rally/benchmarks/data/geonames/documents-2-1k.json.bz2 ']'
+ curl -o /home/sfdev/.rally/benchmarks/data/geonames/documents-2-1k.json.bz2 http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames/documents-2-1k.json.bz2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
+ readonly ARCHIVE=rally-track-data-geonames.tar
+ ARCHIVE=rally-track-data-geonames.tar
+ tar -C /home/sfdev --exclude=rally-track-data-geonames.tar -cf rally-track-data-geonames.tar .rally/benchmarks/tracks/default .rally/benchmarks/data/geonames/documents-2.json.bz2 .rally/benchmarks/data/geonames/documents-2-1k.json.bz2
+ echo 'Created data for geonames in rally-track-data-geonames.tar. Next steps:'
Created data for geonames in rally-track-data-geonames.tar. Next steps:
+ echo ''

+ echo '1. Copy it to the user home directory on the target machine(s).'
1. Copy it to the user home directory on the target machine(s).
+ echo '2. Extract with tar -xf rally-track-data-geonames.tar (will be extracted to ~/.rally/benchmarks).'
2. Extract with tar -xf rally-track-data-geonames.tar (will be extracted to ~/.rally/benchmarks).

The path contains the documents, but they have size 0:

$ ls -l ~/.rally/benchmarks/data/geonames
total 0
-rw-rw-r-- 1 sfdev sfdev 0 May 26 21:06 documents-2-1k.json.bz2
-rw-rw-r-- 1 sfdev sfdev 0 May 26 21:06 documents-2.json.bz2
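Because curl exited without an error in the trace above, the empty files are easy to miss. A small check like the following sketch lists zero-byte files under a data directory; the function name list_empty_files is hypothetical, and the directory is the one from this thread.

```shell
# Hypothetical helper: print every zero-byte regular file below a directory.
list_empty_files() {
  local dir="$1"
  # -size 0 matches only files that are exactly empty.
  find "$dir" -type f -size 0
}

# Usage (path taken from the listing above):
#   list_empty_files ~/.rally/benchmarks/data/geonames
```

No output means every file in the directory has at least some content; it does not prove the files are complete.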

Hi, I figured out the issue. It was caused by the settings on my VM.
Thank you for the response.

Hi Amit, you're welcome!

Hi @json,

I still see this issue sometimes on some machines.

I found a Stack Overflow post that solves the problem.

The -L option solves the problem, as described there.
I think it needs to be added to the download script.

Can you please share what URL requires a redirect?

This is the command the script runs:

curl -o /home/user/.rally/benchmarks/data/geopoint/documents.json.bz2 http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geopoint/documents.json.bz2

The following worked:

curl -L -o /home/user/.rally/benchmarks/data/geopoint/documents.json.bz2 http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geopoint/documents.json.bz2

As I shared above as well:

+ curl -o /home/sfdev/.rally/benchmarks/data/geonames/documents-2-1k.json.bz2 http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames/documents-2-1k.json.bz2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
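A download step that follows redirects and also refuses zero-byte results could look like the sketch below. It is not the code from download.sh: the function name fetch_nonempty is hypothetical, and adding -f and the size check are assumptions layered on top of the -L change discussed here. Why the response body was empty on the affected machines is not established in this thread.

```shell
# Hypothetical wrapper around curl, not the actual download.sh logic.
fetch_nonempty() {
  local url="$1" target="$2"
  # -L follows redirects; -f makes curl return non-zero on HTTP errors (>= 400).
  curl -f -L -o "$target" "$url" || return 1
  # curl still exits 0 when the response body is empty, so check the size too.
  if [ ! -s "$target" ]; then
    echo "empty download: $target" >&2
    rm -f "$target"
    return 1
  fi
}

# Usage (URL and path as quoted in this thread):
#   fetch_nonempty \
#     http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geopoint/documents.json.bz2 \
#     ~/.rally/benchmarks/data/geopoint/documents.json.bz2
```

Removing the empty file matters because the script skips the download when the target file already exists, as the `-f` test in the trace shows.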

Can you please share the output of curl -I http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geopoint/documents.json.bz2? I don't see a redirect here, so I don't understand why -L would help.
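To make that header output quicker to read, the status code and any Location header can be pulled out with a small filter. This is only a sketch; the function name summarize_headers is hypothetical.

```shell
# Hypothetical filter: reads HTTP response headers on stdin and prints the
# status code of the first response plus any Location header it finds.
summarize_headers() {
  awk '
    { sub(/\r$/, "") }                         # strip the CR of CRLF line endings
    NR == 1 && /^HTTP\// { print "status: " $2 }
    tolower($1) == "location:" { print "redirect to: " $2 }
  '
}

# Usage:
#   curl -sI http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geopoint/documents.json.bz2 | summarize_headers
```

A 3xx status together with a "redirect to" line would explain why -L changes the result; a plain 200 would point to some other cause.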

Anyway, @dliappis added -L in https://github.com/elastic/rally-tracks/pull/274 and used a more suitable URL. Indeed, the one in the previous version of the download.sh script should not be used anymore. Please use this new version and tell us if anything goes wrong.

Thank you @Quentin_Pradet and @dliappis for the quick response.

Why don't we use wget instead? It also seems to work fine, without any issue.
