Unable to run a benchmark on a 3-node Elasticsearch cluster

When running a race on a 3-node Elasticsearch cluster, we get the following error. We were able to run a benchmark on a single-node cluster.
The single node had a document count of 3,556,667, whereas the 3-node cluster has a document count of 35,845,521.

Rally version: 2.8.0
Elasticsearch version: 7.17.0
Document count of the index: 35,845,521

We followed the instructions in Tips and Tricks - Rally 2.8.0 documentation (esrally.readthedocs.io).

[INFO] Race id is [d5179e91-1c44-44a6-9f57-72bcd08891a2]

[ERROR] Cannot race. Error in task executor
Cannot download data because no base URL is provided.

We tried running the benchmark with the Elasticsearch DNS name and also with the IPs:

esrally race --track=/home/ec2-user/tracks/ES-Cluster --target-hosts=Zentity.DNS.Com:9200 --pipeline=benchmark-only --client-options="basic_auth_user:'user',basic_auth_password:'password'"

esrally race --track=/home/ec2-user/tracks/ES-Cluster --target-hosts=,, --pipeline=benchmark-only --client-options="basic_auth_user:'user',basic_auth_password:'password'"
{% import "rally.helpers" as rally with context %}
{
  "version": 2,
  "description": "Tracker-generated track for ES-Cluster",
  "indices": [
    {
      "name": "aps",
      "body": "aps.json"
    }
  ],
  "corpora": [
    {
      "name": "aps",
      "documents": [
        {
          "target-index": "aps",
          "source-file": "aps-documents.json.bz2",
          "document-count": 39695546,
          "compressed-bytes": 2144165370,
          "uncompressed-bytes": 36715007720
        }
      ]
    }
  ],
  "schedule": [
    {
      "operation": "delete-index"
    },
    {
      "operation": {
        "operation-type": "create-index",
        "settings": {{index_settings | default({}) | tojson}}
      }
    },
    {
      "operation": {
        "operation-type": "cluster-health",
        "index": "aps",
        "request-params": {
          "wait_for_status": "{{cluster_health | default('yellow')}}",
          "wait_for_no_relocating_shards": "true"
        },
        "retry-until-success": true
      }
    },
    {
      "operation": {
        "operation-type": "bulk",
        "bulk-size": {{bulk_size | default(5000)}},
        "ingest-percentage": {{ingest_percentage | default(100)}}
      },
      "clients": {{bulk_indexing_clients | default(8)}}
    }
  ]
}

{
  "operation": {
    "operation-type": "search",
    "params": {
      "search_request": {
        "index": "aps",
        "body": {
          "query": {
            "match_all": {}
          }
        }
      }
    }
  }
}

[meta]
config.version = 17

[system]
env.name = local

[node]
root.dir = /home/ec2-user/.rally/benchmarks
src.root.dir = /home/ec2-user/.rally/benchmarks/src

[source]
remote.repo.url = https://github.com/elastic/elasticsearch.git
elasticsearch.src.subdir = elasticsearch

[benchmarks]
local.dataset.cache = /home/ec2-user/.rally/benchmarks/data

[reporting]
datastore.type = in-memory
datastore.host =
datastore.port =
datastore.secure = False
datastore.user =
datastore.password =

[tracks]
default.url = https://github.com/elastic/rally-tracks

[teams]
default.url = https://github.com/elastic/rally-teams

[defaults]
preserve_benchmark_candidate = false

[distributions]
release.cache = true

2023-06-28 18:34:02,354 -not-actor-/PID:9670 esrally.rally INFO OS [uname_result(system='Linux', node='', release='3.10.0-1160.92.1.el7.x86_64', version='#1 SMP Thu May 18 11:23:40 UTC 2023', machine='x86_64', processor='x86_64')]
2023-06-28 18:34:02,355 -not-actor-/PID:9670 esrally.rally INFO Python [namespace(_multiarch='x86_64-linux-gnu', cache_tag='cpython-38', hexversion=50860272, name='cpython', version=sys.version_info(major=3, minor=8, micro=16, releaselevel='final', serial=0))]
2023-06-28 18:34:02,355 -not-actor-/PID:9670 esrally.rally INFO Rally version [2.8.0]
2023-06-28 18:34:02,355 -not-actor-/PID:9670 esrally.utils.net INFO Connecting directly to the Internet (no proxy support) for [all_proxy].
2023-06-28 18:34:02,355 -not-actor-/PID:9670 esrally.utils.net INFO Connecting directly to the Internet (no proxy support) for [all_proxy].
2023-06-28 18:34:02,355 -not-actor-/PID:9670 esrally.rally INFO Cleaning track dependency directory [/home/ec2-user/.rally/libs]...
2023-06-28 18:34:02,385 -not-actor-/PID:9670 esrally.rally INFO Actor system already running locally? [False]
2023-06-28 18:34:02,385 -not-actor-/PID:9670 esrally.actor INFO Starting actor system with system base [multiprocTCPBase] and capabilities [{'coordinator': True, 'ip': '', 'Convention Address.IPv4': ''}].
2023-06-28 18:34:02,526 -not-actor-/PID:9685 root INFO ++++ Actor System gen (3, 10) started, admin @ ActorAddr-(T|:1900)
2023-06-28 18:34:02,536 -not-actor-/PID:9670 esrally.racecontrol INFO Race id is [a2b51c7a-a4ae-474a-a76f-9ac3951e35b3]
2023-06-28 18:34:02,536 -not-actor-/PID:9670 esrally.racecontrol INFO User specified pipeline [benchmark-only].
2023-06-28 18:34:02,537 -not-actor-/PID:9670 esrally.racecontrol INFO Using configured hosts [{'host': 'zentity-xx.xx.xx.local', 'port': 9200}]
2023-06-28 18:34:02,539 ActorAddr-(T|:1900)/PID:9685 esrally.actor DEBUG Capabilities [{'coordinator': True, 'ip': '', 'Convention Address.IPv4': '', 'Thespian ActorSystem Name': 'multiprocTCPBase', 'Thespian ActorSystem Version': 2, 'Thespian Watch Supported': True, 'Python Version': (3, 8, 16, 'final', 0), 'Thespian Generation': (3, 10), 'Thespian Version': '1687977242510'}] match requirements [{'coordinator': True}].
2023-06-28 18:34:02,547 ActorAddr-(T|:41812)/PID:9687 esrally.client.factory INFO Creating ES client connected to [{'host': 'zentity-xx.xx.xx.local', 'port': 9200}] with options [{'timeout': 60, 'basic_auth_user': 'user', 'basic_auth_password': '*****'}]
2023-06-28 18:34:03,114 ActorAddr-(T|:41812)/PID:9687 esrally.racecontrol INFO Automatically derived distribution version [7.17.0]
2023-06-28 18:34:03,897 ActorAddr-(T|:41812)/PID:9687 esrally.utils.process INFO From https://github.com/elastic/rally-tracks
   336c149..64fa238  master     -> origin/master
   336c149..64fa238  8.7        -> origin/8.7

2023-06-28 18:34:03,946 ActorAddr-(T|:41812)/PID:9687 esrally.utils.repo INFO Checking out [7.17] in [/home/ec2-user/.rally/benchmarks/tracks/default] for distribution version [7.17.0].
2023-06-28 18:34:04,147 ActorAddr-(T|:41812)/PID:9687 esrally.utils.process INFO Switched to branch '7.17'
Your branch is up-to-date with 'origin/7.17'.

2023-06-28 18:34:04,148 ActorAddr-(T|:41812)/PID:9687 esrally.utils.repo INFO Rebasing on [7.17] in [/home/ec2-user/.rally/benchmarks/tracks/default] for distribution version [7.17.0].
2023-06-28 18:34:05,397 ActorAddr-(T|:41812)/PID:9687 esrally.utils.process INFO Already on '7.17'
Your branch is up-to-date with 'origin/7.17'.

2023-06-28 18:34:05,849 ActorAddr-(T|:41812)/PID:9687 esrally.utils.process INFO Current branch 7.17 is up to date.

2023-06-28 18:34:05,948 ActorAddr-(T|:43559)/PID:9758 esrally.actor INFO Received signal from race control to start engine.
2023-06-28 18:34:05,902 ActorAddr-(T|:41812)/PID:9687 esrally.track.loader INFO Reading track specification file [/home/ec2-user/tracks/ES-Cluster/track.json].
2023-06-28 18:34:05,949 ActorAddr-(T|:43559)/PID:9758 esrally.actor INFO Cluster will not be provisioned by Rally.
2023-06-28 18:34:05,920 ActorAddr-(T|:41812)/PID:9687 esrally.track.loader INFO Final rendered track for '/home/ec2-user/tracks/ES-Cluster/track.json' has been written to '/tmp/tmp5n6v0rtl.json'.
2023-06-28 18:34:05,932 ActorAddr-(T|:41812)/PID:9687 esrally.track.loader INFO Loading template [definition for index aps in aps.json].
2023-06-28 18:34:05,937 ActorAddr-(T|:41812)/PID:9687 esrally.metrics INFO Creating in-memory metrics store
2023-06-28 18:34:05,937 ActorAddr-(T|:41812)/PID:9687 esrally.metrics INFO Opening metrics store for race timestamp=[20230628T183402Z], track=[/home/ec2-user/tracks/ES-Cluster], challenge=[default], car=[['external']]
2023-06-28 18:34:05,938 ActorAddr-(T|:41812)/PID:9687 esrally.metrics INFO Creating file race store
2023-06-28 18:34:05,938 ActorAddr-(T|:41812)/PID:9687 esrally.actor INFO Asking mechanic to start the engine.
2023-06-28 18:34:05,950 ActorAddr-(T|:41812)/PID:9687 esrally.actor INFO Mechanic has started engine successfully.
2023-06-28 18:34:05,951 ActorAddr-(T|:41812)/PID:9687 esrally.actor INFO Telling driver to prepare for benchmarking.
2023-06-28 18:34:05,958 ActorAddr-(T|:37695)/PID:9759 esrally.metrics INFO Creating in-memory metrics store
2023-06-28 18:34:05,958 ActorAddr-(T|:37695)/PID:9759 esrally.metrics INFO Opening metrics store for race timestamp=[20230628T183402Z], track=[/home/ec2-user/tracks/ES-Cluster], challenge=[default], car=[['external']]
2023-06-28 18:34:05,959 ActorAddr-(T|:37695)/PID:9759 esrally.client.factory INFO Creating ES client connected to [{'host': 'zentity-xx.xx.xx.local', 'port': 9200}] with options [{'timeout': 60, 'basic_auth_user': 'user', 'basic_auth_password': '*****','retry_on_timeout': True}]
2023-06-28 18:34:05,960 ActorAddr-(T|:37695)/PID:9759 esrally.driver.driver INFO Checking if REST API is available.
2023-06-28 18:34:05,967 ActorAddr-(T|:37695)/PID:9759 esrally.driver.driver INFO REST API is available.
2023-06-28 18:34:05,969 ActorAddr-(T|:37695)/PID:9759 esrally.actor INFO Starting prepare track process on hosts [['localhost']]
2023-06-28 18:34:05,975 ActorAddr-(T|:40599)/PID:9760 esrally.actor INFO Track Preparator started
2023-06-28 18:34:06,351 ActorAddr-(T|:40599)/PID:9760 esrally.track.loader INFO Reading track specification file [/home/ec2-user/tracks/ES-Cluster/track.json].
2023-06-28 18:34:06,363 ActorAddr-(T|:40599)/PID:9760 esrally.track.loader INFO Final rendered track for '/home/ec2-user/tracks/ES-Cluster/track.json' has been written to '/tmp/tmppqm07b7b.json'.
2023-06-28 18:34:06,370 ActorAddr-(T|:40599)/PID:9760 esrally.track.loader INFO Loading template [definition for index aps in aps.json].
2023-06-28 18:34:06,377 ActorAddr-(T|:40599)/PID:9760 esrally.actor INFO Preparing track [/home/ec2-user/tracks/ES-Cluster]
2023-06-28 18:34:06,377 ActorAddr-(T|:40599)/PID:9760 esrally.actor INFO Reloading track [/home/ec2-user/tracks/ES-Cluster] to ensure plugins are up-to-date.
2023-06-28 18:34:06,805 ActorAddr-(T|:40599)/PID:9760 esrally.track.loader INFO Reading track specification file [/home/ec2-user/tracks/ES-Cluster/track.json].
2023-06-28 18:34:06,817 ActorAddr-(T|:40599)/PID:9760 esrally.track.loader INFO Final rendered track for '/home/ec2-user/tracks/ES-Cluster/track.json' has been written to '/tmp/tmpud4p29hq.json'.
2023-06-28 18:34:06,824 ActorAddr-(T|:40599)/PID:9760 esrally.track.loader INFO Loading template [definition for index aps in aps.json].
2023-06-28 18:34:17,279 ActorAddr-(T|:44913)/PID:9791 esrally.track.loader INFO Resolved data root directory for document corpus [aps] in track [/home/ec2-user/tracks/ES-Cluster] to [['/home/ec2-user/.rally/benchmarks/data/aps']].
2023-06-28 18:34:22,284 ActorAddr-(T|:44913)/PID:9791 esrally.driver.driver ERROR Worker failed. Notifying parent...
Traceback (most recent call last):

  File "/usr/local/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)

  File "/home/ec2-user/.local/lib/python3.8/site-packages/esrally/track/loader.py", line 457, in prepare_docs
    preparator.prepare_document_set(document_set, data_root[0])

  File "/home/ec2-user/.local/lib/python3.8/site-packages/esrally/track/loader.py", line 620, in prepare_document_set
    self.downloader.download(document_set.base_url, target_path, expected_size)

  File "/home/ec2-user/.local/lib/python3.8/site-packages/esrally/track/loader.py", line 508, in download
    raise exceptions.DataError("Cannot download data because no base URL is provided.")

esrally.exceptions.DataError: Cannot download data because no base URL is provided.

2023-06-28 18:34:22,303 ActorAddr-(T|:37695)/PID:9759 esrally.actor ERROR Main driver received a fatal exception from a load generator. Shutting down.
2023-06-28 18:34:22,303 ActorAddr-(T|:37695)/PID:9759 esrally.metrics INFO Closing metrics store.
2023-06-28 18:34:22,304 ActorAddr-(T|:41812)/PID:9687 esrally.actor INFO Received a benchmark failure from [ActorAddr-(T|:37695)] and will forward it now.
2023-06-28 18:34:22,306 -not-actor-/PID:9670 esrally.racecontrol ERROR A benchmark failure has occurred
2023-06-28 18:34:22,306 -not-actor-/PID:9670 esrally.racecontrol INFO Telling benchmark actor to exit.
2023-06-28 18:34:22,307 ActorAddr-(T|:37695)/PID:9759 esrally.actor INFO Main driver received ActorExitRequest and will terminate all load generators.
2023-06-28 18:34:25,308 -not-actor-/PID:9670 esrally.rally INFO Attempting to shutdown internal actor system.
2023-06-28 18:34:25,310 -not-actor-/PID:9686 root INFO ActorSystem Logging Shutdown
2023-06-28 18:34:25,330 -not-actor-/PID:9685 root INFO ---- Actor System shutdown
2023-06-28 18:34:25,331 -not-actor-/PID:9670 esrally.rally INFO Actor system is still running. Waiting...
2023-06-28 18:34:26,332 -not-actor-/PID:9670 esrally.rally INFO Shutdown completed.
2023-06-28 18:34:26,332 -not-actor-/PID:9670 esrally.rally ERROR Cannot run subcommand [race].
Traceback (most recent call last):
  File "/home/ec2-user/.local/lib/python3.8/site-packages/esrally/rally.py", line 1172, in dispatch_sub_command
    race(cfg, args.kill_running_processes)
  File "/home/ec2-user/.local/lib/python3.8/site-packages/esrally/rally.py", line 920, in race
    with_actor_system(racecontrol.run, cfg)
  File "/home/ec2-user/.local/lib/python3.8/site-packages/esrally/rally.py", line 950, in with_actor_system
  File "/home/ec2-user/.local/lib/python3.8/site-packages/esrally/racecontrol.py", line 372, in run
    raise e
  File "/home/ec2-user/.local/lib/python3.8/site-packages/esrally/racecontrol.py", line 369, in run
  File "/home/ec2-user/.local/lib/python3.8/site-packages/esrally/racecontrol.py", line 71, in __call__
  File "/home/ec2-user/.local/lib/python3.8/site-packages/esrally/racecontrol.py", line 308, in benchmark_only
    return race(cfg, external=True)
  File "/home/ec2-user/.local/lib/python3.8/site-packages/esrally/racecontrol.py", line 266, in race
    raise exceptions.RallyError(result.message, result.cause)
esrally.exceptions.RallyError: Error in task executor

Hello, and thank you for your interest in Rally!

Rally is complaining that it's not finding the benchmark data. Despite the confusing error message, it tries to find benchmark data in three places:

  • In the track directory itself (/home/ec2-user/tracks/ES-Cluster in your case), but only if you use --track-path=/home/ec2-user/tracks/ES-Cluster instead of --track=/home/ec2-user/tracks/ES-Cluster.
  • From the Internet using base-url, which is useful for public tracks and is what the error message is about.
  • From the dataset cache at /home/ec2-user/.rally/benchmarks/data, once you have already run the track successfully.
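
To make the three locations concrete, here is a minimal Python sketch of the lookup as described in the list above. It is a simplified model of this explanation, not Rally's actual implementation: the function name locate_corpus, its parameters, and the order in which the cache and base-url are checked are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional


def locate_corpus(
    file_name: str,
    track_dir: Path,
    cache_dir: Path,
    used_track_path: bool,
    base_url: Optional[str] = None,
) -> str:
    """Return where the corpus file would come from (simplified model).

    Hypothetical helper mirroring the three locations listed above;
    it is not Rally's real code.
    """
    # 1. The track directory, but only when the track was given via --track-path.
    if used_track_path and (track_dir / file_name).is_file():
        return str(track_dir / file_name)
    # 2. The local dataset cache, filled by an earlier successful run.
    if (cache_dir / file_name).is_file():
        return str(cache_dir / file_name)
    # 3. A download from base-url, if the track defines one.
    if base_url:
        return f"{base_url.rstrip('/')}/{file_name}"
    # Nothing found locally and nothing to download from: the error in the question.
    raise RuntimeError("Cannot download data because no base URL is provided.")
```

In this model, the situation in the question corresponds to used_track_path=False, an empty cache, and no base_url, which ends in the same error message even though the file is sitting in the track directory.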

In your case, I'm assuming that you followed "Define Custom Workloads: Tracks - Rally 2.8.0 documentation", so you should indeed have aps-documents.json.bz2 under /home/ec2-user/tracks/ES-Cluster. Please make sure to use --track-path instead of --track, as the latter is only documented to accept track names, not paths.
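
As a rough sketch of the suggested fix, the Python snippet below first checks that the expected files are present in the track directory and then builds the same command as in the question, with --track-path in place of --track. The helper names missing_files and race_command are hypothetical, and the host and credentials are the placeholders from the original post.

```python
import shlex
from pathlib import Path
from typing import List


def missing_files(track_dir: Path, required: List[str]) -> List[str]:
    """Return the names in `required` that are absent from track_dir."""
    return [name for name in required if not (track_dir / name).is_file()]


def race_command(track_dir: Path, target_hosts: str, client_options: str) -> List[str]:
    """Build the esrally invocation, passing the directory via --track-path."""
    return [
        "esrally",
        "race",
        f"--track-path={track_dir}",
        f"--target-hosts={target_hosts}",
        "--pipeline=benchmark-only",
        f"--client-options={client_options}",
    ]


if __name__ == "__main__":
    # Paths and file names taken from the question; adjust for your setup.
    track_dir = Path("/home/ec2-user/tracks/ES-Cluster")
    missing = missing_files(
        track_dir, ["track.json", "aps.json", "aps-documents.json.bz2"]
    )
    if missing:
        print("Missing from track directory:", ", ".join(missing))
    cmd = race_command(
        track_dir,
        "Zentity.DNS.Com:9200",
        "basic_auth_user:'user',basic_auth_password:'password'",
    )
    # Print a shell-quoted command line instead of running it.
    print(shlex.join(cmd))
```

The snippet only prints the command, so it can be reviewed before anything is run against the cluster.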

Thank you so much for the response, @Quentin_Pradet. After changing --track to --track-path in the command, I was able to run the benchmark. The issue can be closed.
