ES Rally benchmarking for an existing cluster on AWS

I want to benchmark an Elasticsearch cluster running on AWS using ES Rally on my local machine.
I got the error below when I ran the following benchmarking command.
Input:
esrally --track=data --target-hosts= [cluster public ip]:9200 --pipeline=benchmark-only --offline
Output:
'''
usage: esrally [-h] [--version] [--advanced-config] [--assume-defaults]
[--distribution-version DISTRIBUTION_VERSION]
[--runtime-jdk RUNTIME_JDK]
[--track-repository TRACK_REPOSITORY | --track-path TRACK_PATH]
[--track-revision TRACK_REVISION]
[--team-repository TEAM_REPOSITORY]
[--team-revision TEAM_REVISION] [--offline]
[--race-id RACE_ID] [--pipeline PIPELINE]
[--revision REVISION] [--track TRACK]
[--track-params TRACK_PARAMS] [--challenge CHALLENGE]
[--team-path TEAM_PATH] [--car CAR] [--car-params CAR_PARAMS]
[--elasticsearch-plugins ELASTICSEARCH_PLUGINS]
[--plugin-params PLUGIN_PARAMS] [--target-hosts TARGET_HOSTS]
[--load-driver-hosts LOAD_DRIVER_HOSTS]
[--client-options CLIENT_OPTIONS]
[--on-error {continue,abort}] [--telemetry TELEMETRY]
[--telemetry-params TELEMETRY_PARAMS]
[--distribution-repository DISTRIBUTION_REPOSITORY]
[--include-tasks INCLUDE_TASKS | --exclude-tasks EXCLUDE_TASKS]
[--user-tag USER_TAG] [--report-format {markdown,csv}]
[--show-in-report {available,all-percentiles,all}]
[--report-file REPORT_FILE] [--preserve-install] [--test-mode]
[--enable-driver-profiling] [--quiet]
[--kill-running-processes]
{race,race-async,list,info,create-track,generate,compare,configure,download,install,start,stop}
...
esrally: error: argument subcommand: invalid choice: '[cluster public ip]:9200' (choose from 'race', 'race-async', 'list', 'info', 'create-track', 'generate', 'compare', 'configure', 'download', 'install', 'start', 'stop')
'''

The last line states the problem: you are missing a subcommand. Did you copy that command from the documentation? If so, we might need to fix it.

"esrally --track=data --target-hosts= [cluster public ip]:9200 --pipeline=benchmark-only --offline"
This is the command I gave as input.
Can you please answer my question?

  1. Is it possible to benchmark a cluster running on AWS from ES Rally on a local machine?
    If yes, can you please provide the exact command pattern for benchmarking the cluster?

Hi,

you have a space character between the command line parameter name and the value for --target-hosts. I can reproduce this with (note the space character):

esrally --target-hosts= 127.0.0.1:9200 --pipeline=benchmark-only
usage: esrally [-h] [--version] [--advanced-config] [--assume-defaults] [--distribution-version DISTRIBUTION_VERSION] [--runtime-jdk RUNTIME_JDK]
               [--track-repository TRACK_REPOSITORY | --track-path TRACK_PATH] [--track-revision TRACK_REVISION] [--team-repository TEAM_REPOSITORY]
               [--team-revision TEAM_REVISION] [--offline] [--race-id RACE_ID] [--pipeline PIPELINE] [--revision REVISION] [--track TRACK] [--track-params TRACK_PARAMS]
               [--challenge CHALLENGE] [--team-path TEAM_PATH] [--car CAR] [--car-params CAR_PARAMS] [--elasticsearch-plugins ELASTICSEARCH_PLUGINS]
               [--plugin-params PLUGIN_PARAMS] [--target-hosts TARGET_HOSTS] [--load-driver-hosts LOAD_DRIVER_HOSTS] [--client-options CLIENT_OPTIONS]
               [--on-error {continue,continue-on-non-fatal,abort}] [--telemetry TELEMETRY] [--telemetry-params TELEMETRY_PARAMS]
               [--distribution-repository DISTRIBUTION_REPOSITORY] [--include-tasks INCLUDE_TASKS | --exclude-tasks EXCLUDE_TASKS] [--user-tag USER_TAG]
               [--report-format {markdown,csv}] [--show-in-report {available,all-percentiles,all}] [--report-file REPORT_FILE] [--preserve-install] [--test-mode]
               [--enable-driver-profiling] [--quiet] [--kill-running-processes]
               {race,list,info,create-track,generate,compare,configure,download,install,start,stop} ...
esrally: error: argument subcommand: invalid choice: '127.0.0.1:9200' (choose from 'race', 'list', 'info', 'create-track', 'generate', 'compare', 'configure', 'download', 'install', 'start', 'stop')
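
The effect of the stray space can be reproduced without Rally at all. The following is a minimal sketch that only uses the shell's own word splitting: it counts how many separate words the shell produces with and without the space. The variable names are illustrative and not part of Rally.

```shell
# With the space, the shell splits the option name and its value into two words,
# so the address arrives as a separate positional argument.
set -- --target-hosts= 127.0.0.1:9200 --pipeline=benchmark-only
broken_count=$#
broken_second=$2

# Without the space, the option name and its value stay in a single word.
set -- --target-hosts=127.0.0.1:9200 --pipeline=benchmark-only
fixed_count=$#
fixed_first=$1

printf 'broken: %s words, stray word: %s\n' "$broken_count" "$broken_second"
printf 'fixed:  %s words, first word: %s\n' "$fixed_count" "$fixed_first"
```

In the broken case the address is a standalone word, which is why the argument parser tries to read it as the subcommand and rejects it.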

So please remove the space character and you should be fine. A couple of related notes:

  • I assume that you do not benchmark over the Internet but rather have a separate EC2 instance deployed in the same region as your Elasticsearch cluster. Otherwise you will likely get bogus results, because you will saturate the uplink of your Internet connection instead of Elasticsearch.
  • If that cluster is publicly reachable, I suggest turning on TLS encryption and setting up basic authentication.
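
Putting the two notes together, a benchmark-only invocation against a TLS-protected cluster with basic authentication could look roughly like the sketch below. The host address, user name, and password are hypothetical placeholders, and the track name is taken from the question. The script only assembles and prints the command so it can be reviewed first. Depending on the Rally version, a subcommand such as race may also be needed.

```shell
# Hypothetical values; replace them with your own.
TARGET_HOST="10.0.0.5:9200"   # assumed address of the cluster, reachable from the load driver
ES_USER="rally_user"          # assumed user name
ES_PASSWORD="changeme"        # assumed password

# Client options that enable TLS with certificate verification and basic authentication.
CLIENT_OPTIONS="use_ssl:true,verify_certs:true,basic_auth_user:'${ES_USER}',basic_auth_password:'${ES_PASSWORD}'"

# Note: no space between --target-hosts= and its value.
RALLY_CMD="esrally --track=data --target-hosts=${TARGET_HOST} --pipeline=benchmark-only --client-options=\"${CLIENT_OPTIONS}\""

# Print the command for review instead of running it.
printf '%s\n' "$RALLY_CMD"
```

Printing the command first makes it easy to spot a stray space or a quoting mistake before a long benchmark run starts.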

For tips on how to avoid common benchmarking pitfalls, you can check our blog post Seven Tips for Better Elasticsearch Benchmarks.

Happy benchmarking!

Daniel

Thanks @danielmitterdorfer, this helped me.