[ERROR] Cannot race. Fatal track or load generator indication Child Aborted

I am trying to put maximum load on my cluster using distributed load drivers. I start the benchmark coordinator on 10.3.75.152 and run:

esrally race --pipeline=benchmark-only --target-hosts=x.x.x.x:9200,y.y.y.y:9200,z.z.z.z:9200 --load-driver-hosts=10.127.25.132,10.127.24.64 --track-path=~/parameter_sources

But Rally fails with this error: Cannot race. Fatal track or load generator indication Child Aborted.

Here is the log:

2018-08-13 10:01:16,556 ActorAddr-(T|:44073)/PID:4405 esrally.actor INFO Capabilities [{'coordinator': True, 'ip': '10.3.75.152', 'Convention Address.IPv4': '10.3.75.152:1900', 'Thespian ActorSystem Name': 'multiprocTCPBase', 'Thespian ActorSystem Version': 2, 'Thespian Watch Supported': True, 'Python Version': (3, 6, 5, 'final', 0), 'Thespian Generation': (3, 9), 'Thespian Version': '1534153011137'}] match requirements [{'coordinator': True}].
2018-08-13 10:01:16,418 ActorAddr-(T|:33857)/PID:4406 esrally.client INFO HTTP compression: off
2018-08-13 10:01:16,419 ActorAddr-(T|:33857)/PID:4406 esrally.client INFO HTTP compression: off
2018-08-13 10:01:16,558 ActorAddr-(T|:44073)/PID:4405 esrally.actor INFO Telling driver to start benchmark.
2018-08-13 10:01:16,585 ActorAddr-(T|:35105)/PID:4408 esrally.metrics INFO Creating in-memory metrics store
2018-08-13 10:01:16,592 ActorAddr-(T|:1900)/PID:4236 esrally.actor INFO Checking capabilities [{'coordinator': True, 'ip': '10.3.75.152', 'Convention Address.IPv4': '10.3.75.152:1900', 'Thespian ActorSystem Name': 'multiprocTCPBase', 'Thespian ActorSystem Version': 2, 'Thespian Watch Supported': True, 'Python Version': (3, 6, 5, 'final', 0), 'Thespian Generation': (3, 9), 'Thespian Version': '1534153011137'}] against requirements [{'ip': '10.127.25.132'}] failed.
2018-08-13 10:01:16,586 ActorAddr-(T|:35105)/PID:4408 esrally.metrics INFO Opening metrics store for trial timestamp=[20180813T100114Z], track=[image_parameter_sources], challenge=[default], car=[['external']]
2018-08-13 10:01:16,595 ActorAddr-(T|:1900)/PID:4236 esrally.actor INFO Checking capabilities [{'coordinator': True, 'ip': '10.3.75.152', 'Convention Address.IPv4': '10.3.75.152:1900', 'Thespian ActorSystem Name': 'multiprocTCPBase', 'Thespian ActorSystem Version': 2, 'Thespian Watch Supported': True, 'Python Version': (3, 6, 5, 'final', 0), 'Thespian Generation': (3, 9), 'Thespian Version': '1534153011137'}] against requirements [{'ip': '10.127.24.64'}] failed.
2018-08-13 10:01:16,587 ActorAddr-(T|:35105)/PID:4408 esrally.actor INFO Checking capabilities [{'coordinator': True, 'ip': '10.3.75.152', 'Convention Address.IPv4': '10.3.75.152:1900', 'Thespian ActorSystem Name': 'multiprocTCPBase', 'Thespian ActorSystem Version': 2, 'Thespian Watch Supported': True, 'Python Version': (3, 6, 5, 'final', 0), 'Thespian Generation': (3, 9), 'Thespian Version': '1534153011137'}] against requirements [{'ip': '10.127.25.132'}] failed.
2018-08-13 10:01:16,605 -not-actor-/PID:4400 esrally.racecontrol ERROR A benchmark failure has occurred
2018-08-13 10:01:16,602 ActorAddr-(T|:44073)/PID:4405 esrally.actor INFO Received a benchmark failure from [ActorAddr-(T|:35105)] and will forward it now.
2018-08-13 10:01:16,605 -not-actor-/PID:4400 esrally.racecontrol INFO Telling benchmark actor to exit.
2018-08-13 10:01:16,606 ActorAddr-(T|:44073)/PID:4405 esrally.actor INFO BenchmarkActor received unknown message [ActorExitRequest] (ignoring).
2018-08-13 10:01:16,607 -not-actor-/PID:4400 esrally.rally ERROR Cannot run subcommand [race].
Traceback (most recent call last):
  File "/home/dr/.local/lib/python3.6/site-packages/esrally/rally.py", line 454, in dispatch_sub_command
    race(cfg)
  File "/home/dr/.local/lib/python3.6/site-packages/esrally/rally.py", line 383, in race
    with_actor_system(lambda c: racecontrol.run(c), cfg)
  File "/home/dr/.local/lib/python3.6/site-packages/esrally/rally.py", line 404, in with_actor_system
    runnable(cfg)
  File "/home/dr/.local/lib/python3.6/site-packages/esrally/rally.py", line 383, in <lambda>
    with_actor_system(lambda c: racecontrol.run(c), cfg)
  File "/home/dr/.local/lib/python3.6/site-packages/esrally/racecontrol.py", line 383, in run
    raise e
  File "/home/dr/.local/lib/python3.6/site-packages/esrally/racecontrol.py", line 380, in run
    pipeline(cfg)
  File "/home/dr/.local/lib/python3.6/site-packages/esrally/racecontrol.py", line 61, in __call__
    self.target(cfg)
  File "/home/dr/.local/lib/python3.6/site-packages/esrally/racecontrol.py", line 327, in benchmark_only
    return race(cfg, external=True)
  File "/home/dr/.local/lib/python3.6/site-packages/esrally/racecontrol.py", line 279, in race
    raise exceptions.RallyError(result.message, result.cause)
esrally.exceptions.RallyError: ('Fatal track or load generator indication', 'Child Aborted')
2018-08-13 10:01:16,589 ActorAddr-(T|:35105)/PID:4408 esrally.actor INFO Checking capabilities [{'coordinator': True, 'ip': '10.3.75.152', 'Convention Address.IPv4': '10.3.75.152:1900', 'Thespian ActorSystem Name': 'multiprocTCPBase', 'Thespian ActorSystem Version': 2, 'Thespian Watch Supported': True, 'Python Version': (3, 6, 5, 'final', 0), 'Thespian Generation': (3, 9), 'Thespian Version': '1534153011137'}] against requirements [{'ip': '10.127.24.64'}] failed.
2018-08-13 10:01:16,595 ActorAddr-(T|:35105)/PID:4408 esrally.driver.driver.DriverActor ERROR Pending Actor create for ActorAddr-(T|:35105) failed (3586): None
2018-08-13 10:01:16,595 ActorAddr-(T|:35105)/PID:4408 esrally.actor INFO A track preparator has exited.
2018-08-13 10:01:16,644 ActorAddr-(T|:44073)/PID:4405 esrally.actor INFO BenchmarkActor received unknown message [ChildActorExited:ActorAddr-(T|:35105)] (ignoring).
2018-08-13 10:01:16,596 ActorAddr-(T|:35105)/PID:4408 esrally.actor ERROR Main driver received a fatal indication from a load generator (Child Aborted). Shutting down.
2018-08-13 10:01:16,600 ActorAddr-(T|:35105)/PID:4408 esrally.metrics INFO Closing metrics store.
2018-08-13 10:01:16,603 ActorAddr-(T|:35105)/PID:4408 esrally.driver.driver.DriverActor ERROR Pending Actor create for ActorAddr-(T|:35105) failed (3586): None
2018-08-13 10:01:16,603 ActorAddr-(T|:35105)/PID:4408 esrally.actor INFO A track preparator has exited.
2018-08-13 10:01:16,604 ActorAddr-(T|:35105)/PID:4408 esrally.actor ERROR Main driver received a fatal indication from a load generator (Child Aborted). Shutting down.
2018-08-13 10:01:16,623 ActorAddr-(T|:35105)/PID:4408 esrally.actor INFO Main driver received ActorExitRequest and will terminate all load generators.
2018-08-13 10:01:16,684 ActorAddr-(T|:44073)/PID:4405 esrally.actor INFO BenchmarkActor received unknown message [ChildActorExited:ActorAddr-(T|:35105)] (ignoring).

Any help is appreciated.

Hello,

I presume you have read and followed the steps for running the Rally daemon and the page about distributing the load test driver in tips and tricks; if not, I strongly recommend reading them carefully.

Since you are using --track-path, have you ensured ~/parameter_sources exists on both load driver nodes?
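
One quick way to verify this is to run a small check on each load driver node. The following is only a sketch: the function name check_track_path is made up for illustration, and it assumes that a track directory is valid when it contains a track.json file, so adjust it if your track is laid out differently.

```python
import os


def check_track_path(path):
    """Return a list of problems found with a Rally track directory.

    Hypothetical helper, not part of Rally. Assumes the directory
    should contain a track.json file.
    """
    expanded = os.path.expanduser(path)
    problems = []
    if not os.path.isdir(expanded):
        problems.append(f"{expanded} is not a directory")
    elif not os.path.isfile(os.path.join(expanded, "track.json")):
        problems.append(f"{expanded} has no track.json")
    return problems


if __name__ == "__main__":
    # Path taken from the --track-path argument in the question.
    found = check_track_path("~/parameter_sources")
    for problem in found:
        print(problem)
    if not found:
        print("track path looks fine")
```

Run it as the same user that starts the Rally daemon, because ~ expands to that user's home directory.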

Can you paste the logs (under ~/.rally/logs) from the load generating nodes (10.127.25.132, 10.127.24.64)?
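
While collecting those logs, it may help to summarize the failed capability checks in the coordinator log. The sketch below assumes the log line format shown in your post; the names FAILED_CHECK and unmatched_requirements are made up for illustration. It reports, for each actor system IP, which required IPs it could not satisfy. If only the coordinator IP ever shows up as a candidate, one plausible reading is that the daemons on the load driver nodes have not joined the coordinator, but the remote logs are needed to confirm that.

```python
import re

# Matches lines such as:
#   ... Checking capabilities [{... 'ip': 'A', ...}] against requirements [{'ip': 'B'}] failed.
FAILED_CHECK = re.compile(
    r"Checking capabilities \[\{.*?'ip': '(?P<actual>[^']+)'.*\}\] "
    r"against requirements \[\{'ip': '(?P<required>[^']+)'\}\] failed\."
)


def unmatched_requirements(log_lines):
    """Map each actor system IP to the set of required IPs it failed to match."""
    result = {}
    for line in log_lines:
        match = FAILED_CHECK.search(line)
        if match:
            result.setdefault(match.group("actual"), set()).add(match.group("required"))
    return result


if __name__ == "__main__":
    import sys

    # Usage: python summarize_checks.py <path to a Rally log file>
    with open(sys.argv[1], encoding="utf-8") as log_file:
        for actual, required in unmatched_requirements(log_file).items():
            print(f"{actual} could not satisfy: {', '.join(sorted(required))}")
```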

Dimitris
