I am trying to generate a larger data set as described in this suggestion. I created a new custom track folder, set the index count to 5, copied all the other files and directories from the default "nyc_taxis" track into the new folder, and ran Rally, but I get the error below. The suggestion dates from 2018; is this trick still valid in the latest Rally release, 1.4.0? Any pointers or suggestions?
Command run:
esrally --distribution-version=7.5.2 --target-hosts=192.168.20.4:39200,192.168.20.169:39200 --on-error=abort --track-path=~/.rally/benchmarks/tracks/default/nyc_taxis_many/track.json
Error:
2020-03-23 21:29:41,826 ActorAddr-(T|:46634)/PID:8306 esrally.utils.modules DEBUG Adding [/home/elastic/.rally/benchmarks/tracks/default] to Python load path.
2020-03-23 21:29:41,826 ActorAddr-(T|:46634)/PID:8306 esrally.utils.modules DEBUG Loading module [nyc_taxis_many.track]
2020-03-23 21:29:41,838 ActorAddr-(T|:46634)/PID:8306 esrally.driver.runner DEBUG Registering runner function [<function wait_for_ml_lookback at 0x7f295ded2bf8>] for [wait-for-ml-lookback].
2020-03-23 21:29:41,839 ActorAddr-(T|:46634)/PID:8306 esrally.actor ERROR Error in track preparator
Traceback (most recent call last):
  File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/esrally/actor.py", line 85, in guard
    return f(self, msg, sender)
  File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/esrally/driver/driver.py", line 331, in receiveMsg_PrepareTrack
    track.prepare_track(msg.track, cfg)
  File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/esrally/track/loader.py", line 345, in prepare_track
    for corpus in used_corpora(t, cfg):
  File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/esrally/track/loader.py", line 327, in used_corpora
    param_source = operation_parameters(t, sub_task)
  File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/esrally/track/loader.py", line 318, in operation_parameters
    return params.param_source_for_operation(op.type, t, op.params, task.name)
  File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/esrally/track/params.py", line 38, in param_source_for_operation
    return __PARAM_SOURCES_BY_OP[op_type](track, params, operation_name=task_name)
  File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/esrally/track/params.py", line 368, in __init__
    raise exceptions.InvalidSyntax("'index' is mandatory and is missing for operation '{}'".format(kwargs.get("operation_name")))
esrally.exceptions.InvalidSyntax: ("'index' is mandatory and is missing for operation 'default'", None)
track.json (index count updated to 5):
{% import "rally.helpers" as rally with context %}
{% set index_count = 5 %}
{
  "version": 2,
  "description": "Taxi rides in New York in 2015",
  "indices": [
    {% set comma = joiner() %}
    {% for item in range(index_count) %}
    {{ comma() }}
    {
      "name": "nyc_taxis-{{item}}",
      "body": "index.json",
      "types": [ "type" ],
      "auto-managed": false
    }
    {% endfor %}
  ],
  "corpora": [
    {
      "name": "nyc_taxis",
      "base-url": "http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/nyc_taxis",
      "documents": [
        {% set comma = joiner() %}
        {% for item in range(index_count) %}
        {{ comma() }}
        {
          "target-index": "nyc_taxis-{{item}}",
          "target-type": "type",
          "source-file": "documents.json.bz2",
          "#COMMENT": "ML benchmark rely on the fact that the document count stays constant.",
          "document-count": 165346692,
          "compressed-bytes": 4812721501,
          "uncompressed-bytes": 79802445255
        }
        {% endfor %}
      ]
    }
  ],
  "operations": [
    {{ rally.collect(parts="operations/*.json") }}
  ],
  "challenges": [
    {{ rally.collect(parts="challenges/*.json") }}
  ]
}
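Read literally, the error says that the operation named `default` has no `index` property, which would make sense now that the track defines five indices instead of one. Below is a minimal sketch of what that operation might look like with the property added. It assumes that `default` is the match-all search operation under operations/ in the copied track, and that an index pattern such as `nyc_taxis-*` is accepted there. Neither assumption is confirmed by the output above, so this is something to try rather than a known fix.

```json
{# Assumption: "default" is a search operation in operations/default.json. #}
{# Assumption: the "index" value may be a pattern that matches nyc_taxis-0 to nyc_taxis-4. #}
{
  "name": "default",
  "operation-type": "search",
  "index": "nyc_taxis-*",
  "body": {
    "query": {
      "match_all": {}
    }
  }
}
```

If that gets past track preparation, the other search operations in the copied track would presumably need the same property.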