I am benchmarking an Elasticsearch cluster, but Rally gets stuck with the following output on the command line:
    ____        ____
   / __ \____ _/ / /_  __
  / /_/ / __ `/ / / / / /
 / _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
                /____/
[INFO] You did not provide an explicit timeout in the client options. Assuming default of 10 seconds.
************** WARNING: A dark dungeon lies ahead of you **************
Rally does not have control over the configuration of the benchmarked
Elasticsearch cluster.
Be aware that results may be misleading due to problems with the setup.
Rally is also not able to gather lots of metrics at all (like CPU usage
of the benchmarked cluster) or may even produce misleading metrics (like
the index size).
****** Use this pipeline only if you are aware of the tradeoffs. ******
*************************** Watch your step! ***************************
[INFO] Racing on track [ilp_track], challenge [index il-search performance data] and car ['external'] with version [5.2.1].
[WARNING] indexing_total_time is 37333 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] refresh_total_time is 11535 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] flush_total_time is 1146 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
The following is the Rally log output:
2018-09-17 19:01:20,933 ActorAddr-(T|:33393)/PID:17984 esrally.actor INFO Telling driver to start benchmark.
2018-09-17 19:01:20,853 ActorAddr-(T|:36524)/PID:17991 esrally.mechanic.launcher INFO REST API is available. Attaching telemetry devices to cluster.
2018-09-17 19:01:20,908 ActorAddr-(T|:36524)/PID:17991 esrally.mechanic.launcher INFO Telemetry devices are now attached to the cluster.
2018-09-17 19:01:20,909 ActorAddr-(T|:36524)/PID:17991 esrally.actor INFO Transitioning from [nodes_started] to [apply_meta_info].
2018-09-17 19:01:20,930 ActorAddr-(T|:36524)/PID:17991 esrally.actor INFO Transitioning from [cluster_started] to [benchmark_starting].
2018-09-17 19:01:20,965 ActorAddr-(T|:36524)/PID:17991 esrally.mechanic.telemetry WARNING indexing_total_time is 37333 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
2018-09-17 19:01:20,966 ActorAddr-(T|:36524)/PID:17991 esrally.mechanic.telemetry WARNING refresh_total_time is 11535 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
2018-09-17 19:01:20,967 ActorAddr-(T|:36524)/PID:17991 esrally.mechanic.telemetry WARNING flush_total_time is 1146 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
2018-09-17 19:01:20,992 ActorAddr-(T|:33393)/PID:17984 esrally.actor INFO BenchmarkActor received unknown message [<esrally.mechanic.mechanic.BenchmarkStarted object at 0x7f3dae344dd8>] (ignoring).
2018-09-17 19:01:21,43 ActorAddr-(T|:40809)/PID:17993 esrally.metrics INFO Creating in-memory metrics store
2018-09-17 19:01:21,45 ActorAddr-(T|:40809)/PID:17993 esrally.metrics INFO Opening metrics store for trial timestamp=[20180917T190119Z], track=[ilp_track], challenge=[index il-search performance data], car=[['external']]
2018-09-17 19:01:21,46 ActorAddr-(T|:40809)/PID:17993 esrally.actor INFO Capabilities [{'Thespian Generation': (3, 9), 'Thespian Version': '1537210880128', 'ip': '127.0.0.1', 'Thespian ActorSystem Version': 2, 'Python Version': (3, 5, 6, 'final', 0), 'coordinator': True, 'Thespian ActorSystem Name': 'multiprocTCPBase', 'Convention Address.IPv4': '127.0.0.1:1900', 'Thespian Watch Supported': True}] match requirements [{'coordinator': True}].
2018-09-17 19:01:21,181 ActorAddr-(T|:40809)/PID:17993 esrally.driver.driver INFO Benchmark is about to start.
2018-09-17 19:01:21,169 ActorAddr-(T|:37553)/PID:17994 esrally.actor INFO Preparing track [ilp_track]
2018-09-17 19:01:21,176 ActorAddr-(T|:37553)/PID:17994 esrally.track.loader INFO Resolved data root directory for document corpus [il-search-performance-data] in track [ilp_track] to ['/home/arane/rally-tracks/ilp_track', '/root/.rally/benchmarks/data/il-search-performance-data'].
The following is the output of /tmp/thespian.log:
2018-09-17 17:27:55.671227 p3123 ERR Socket error sending to ActorAddr-(T|:38743) on <socket.socket fd=10, family=AddressFamily.AF_INET, type=2049, proto=6, laddr=('10.34.33.113', 41456)>: [Errno 110] Connection timed out / 110: ************* TransportIntent(ActorAddr-(T|:38743)-pending-ExpiresIn_0:02:52.671847-<class 'thespian.system.messages.multiproc.EndpointConnected'>-<thespian.system.messages.multiproc.EndpointConnected object at 0x7fd627667710>-quit_0:02:52.671733)
2018-09-17 17:27:55.671849 p3134 ERR Socket error sending to ActorAddr-(T|:38743) on <socket.socket fd=10, family=AddressFamily.AF_INET, type=2049, proto=6, laddr=('10.34.33.113', 41466)>: [Errno 110] Connection timed out / 110: ************* TransportIntent(ActorAddr-(T|:38743)-pending-ExpiresIn_0:02:52.739763-<class 'thespian.system.messages.multiproc.EndpointConnected'>-<thespian.system.messages.multiproc.EndpointConnected object at 0x7fd62766a9e8>-quit_0:02:52.739662)
2018-09-17 17:27:55.673116 p3124 ERR Socket error sending to ActorAddr-(T|:38743) on <socket.socket fd=10, family=AddressFamily.AF_INET, type=2049, proto=6, laddr=('10.34.33.113', 41458)>: [Errno 110] Connection timed out / 110: ************* TransportIntent(ActorAddr-(T|:38743)-pending-ExpiresIn_0:02:52.682988-<class 'thespian.system.messages.multiproc.EndpointConnected'>-<thespian.system.messages.multiproc.EndpointConnected object at 0x7fd627667ac8>-quit_0:02:52.682855)
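
To see at a glance which actor address the timeouts are aimed at, the thespian.log lines above could be tallied with a small script. This is only a minimal sketch: the regular expression is an assumption derived from the three lines shown here, and `summarize_socket_errors` and `summarize_file` are hypothetical helper names, not part of Rally or Thespian.

```python
import re
from collections import Counter

# Assumed line shape, taken only from the "Socket error sending to ..." lines
# pasted above; other Thespian messages are simply skipped.
SOCKET_ERROR = re.compile(
    r"^(?P<ts>\S+ \S+) p(?P<pid>\d+) ERR Socket error sending to "
    r"ActorAddr-\(T\|:(?P<port>\d+)\) .*?"
    r"\[Errno (?P<errno>\d+)\] (?P<reason>[^/]+?) /"
)


def summarize_socket_errors(lines):
    """Count socket errors per (target actor port, errno, reason)."""
    counts = Counter()
    for line in lines:
        match = SOCKET_ERROR.match(line)
        if match:
            key = (
                int(match.group("port")),
                int(match.group("errno")),
                match.group("reason"),
            )
            counts[key] += 1
    return counts


def summarize_file(path):
    """Hypothetical convenience wrapper: tally the errors in one log file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return summarize_socket_errors(handle)
```

Calling `summarize_file("/tmp/thespian.log")` returns a `Counter` keyed by target port, errno, and reason, which makes it easy to tell whether every failure points at the same actor address (port 38743 in the excerpt above).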