****** Use this pipeline only if you are aware of the tradeoffs. ******
*************************** Watch your step! ***************************
[INFO] Racing on track [geonames], challenge [append-no-conflicts-index-only] and car ['external'] with version [2.3.5].
[WARNING] refresh_total_time is 2 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] flush_total_time is 4 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[ERROR] Cannot race. Error in track preparator
Cannot find /home/.rally/benchmarks/data/geonames/documents-2.json.bz2. Please disable offline mode and retry again.
Getting further help:
Check the log files in /root/.rally/logs for errors.
However, there is a documents.json.bz2 file in that directory:
[root@hdh154 /home/.rally/benchmarks/data/geonames]# ll
total 193224
-rwxrwxrwx 1 root root 197857614 Jun 23 15:37 documents.json.bz2
Could you help me resolve this, please? Thank you so much.
I have downloaded the track files and put them under another path:
[root@hdh154 ~/.rally/benchmarks/tracks/default/geonames]# ll
total 660
drwxrwxrwx 2 root root 4096 Jun 20 11:27 challenges
-rwxrwxrwx 1 root root 44 Jun 20 10:39 files.txt
-rw-r--r-- 1 root root 2685 Jun 20 10:39 index.json
drwxrwxrwx 2 root root 4096 Jun 20 10:39 operations
-rw-r--r-- 1 root root 2061 Jun 20 10:39 README.md
-rw-r--r-- 1 root root 642669 Jun 20 10:39 terms.txt
-rwxrwxrwx 1 root root 1196 Jun 20 14:13 track.json
-rw-r--r-- 1 root root 4192 Jun 20 10:39 track.py
Is there some problem?
So there is a file called documents.json.bz2, but Rally expects documents-2.json.bz2. Renaming the file won't help because it seems you downloaded an older version (the expected size is 264698741 bytes but yours is 197857614 bytes). How did you download the file? Did you use the download.sh script in rally-tracks as suggested in the docs?
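The size comparison above can be scripted before starting a race. This is only a sketch: the function name `size_matches` is made up for illustration, and the expected size is the one quoted in this reply, so it applies only to this version of the geonames data file.

```python
import os

# Expected size in bytes of documents-2.json.bz2, as quoted in the reply above.
EXPECTED_SIZE = 264698741


def size_matches(path, expected=EXPECTED_SIZE):
    """Return True if the file at `path` has exactly `expected` bytes."""
    return os.path.getsize(path) == expected


# Example usage (the path is the one from the error message above):
# size_matches("/home/.rally/benchmarks/data/geonames/documents-2.json.bz2")
```

A matching size does not prove the file is intact, but a mismatch is a quick sign that the wrong or an incomplete file was downloaded.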
Thank you for your reply.
Our server can't connect to the network, so where can I get documents-2.json.bz2 manually, please?
Maybe I can try renaming documents.json.bz2 to documents-2.json.bz2 and try again.
That is what the download.sh script is for. It will download all track-related files that you need to run in offline mode.
As I explained in my first answer, renaming the file will not help you. It is the wrong file to begin with: the file length does not match and it will also have the wrong file structure.
Is there a download.sh file in that directory? If not, please follow the instructions in the docs.
I have got the file now, but I still get an error:
[INFO] Racing on track [geonames], challenge [append-no-conflicts-index-only] and car ['external'] with version [2.3.5].
[INFO] Decompressing track data from [/home/.rally/benchmarks/data/geonames/documents-2.json.bz2] to [/home/.rally/benchmarks/data/geonames/documents-2.json] (resulting size: 3.30 GB) ...
[ERROR] Cannot race. Error in track preparator
Invalid data stream
Indeed, the file size of documents-2.json.bz2 seems fine. Can you please delete the partially uncompressed file documents-2.json (i.e. rm -f documents-2.json) and retry?
[INFO] Racing on track [geonames], challenge [append-no-conflicts-index-only] and car ['external'] with version [2.3.5].
[INFO] Decompressing track data from [/home/.rally/benchmarks/data/geonames/documents-2.json.bz2] to [/home/.rally/benchmarks/data/geonames/documents-2.json] (resulting size: 3.30 GB) ...
[WARNING] refresh_total_time is 36 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] flush_total_time is 33 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[ERROR] Cannot race. Error in track preparator
Invalid data stream
Run md5sum documents-2.json.bz2 and paste the result. The fingerprint should be c6fbf5e7b20c3c46f4cd6ab8385a9cb7.
Remove documents-2.json once more with rm -f documents-2.json
Attempt to decompress it yourself with bzip2 -dk documents-2.json.bz2. Note that this requires the bzip2 tool which you might need to install separately.
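If the bzip2 tool is not available on the machine, the checks above can also be approximated with Python's standard library. This is a minimal sketch, not part of Rally: the helper names `md5_of` and `bz2_is_readable` are made up for illustration, and the expected fingerprint is the one quoted in the steps above.

```python
import bz2
import hashlib

# Fingerprint quoted in the steps above for documents-2.json.bz2.
EXPECTED_MD5 = "c6fbf5e7b20c3c46f4cd6ab8385a9cb7"


def md5_of(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bz2_is_readable(path, chunk_size=1024 * 1024):
    """Return True if the whole file decompresses as bzip2 without errors.

    The decompressed data is discarded, so nothing is written to disk.
    """
    try:
        with bz2.open(path, "rb") as f:
            while f.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, ValueError):
        # OSError: invalid data; EOFError: truncated archive.
        return False


# Example usage:
# path = "documents-2.json.bz2"
# print(md5_of(path) == EXPECTED_MD5, bz2_is_readable(path))
```

If the fingerprint differs from the expected one, the download itself is damaged and decompression is expected to fail regardless of which tool is used.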
I downloaded it through a web browser, and here's the output:
[root@hdh154 /home/.rally/benchmarks/data/geonames]# md5sum documents-2.json.bz2
3682050a7eb9dc53ba923379bc4a4ea3 documents-2.json.bz2
Downloading via a browser is fine as well. I just don't understand why the artefact is corrupted on your machine. I tried that exact command now on several machines with the same (successful) result. Are you behind a proxy? Do you have an opportunity to download the file from a machine that is not behind that proxy?