Hi Omid,
First, does Rally have a max execution time parameter? Some of these tests run for a long time, so I was wondering if it is possible to limit the duration by time?
There's no limit on the overall race (benchmark) duration. This would actually make little sense for non-trivial races, which have a certain structure; aborting in the middle would leave you with partial results.
Or can it be scaled down by the number of tasks?
Yes, somewhat.
Let me start with some background. A race consists of a sequence of tasks as specified by a track challenge and track parameters. Each task can be time limited (see the time-period property in the schedule documentation), so you could create a custom track/challenge which is time limited, provided all of its tasks are either time limited or take very little time to execute. However, this is not how public challenges from Rally tracks are typically written.
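For reference, a time-limited task entry in a custom challenge's schedule could look roughly like the sketch below. The operation name is a placeholder and the numbers are purely illustrative; see the schedule documentation for the authoritative list of properties.

{
  "operation": "<your-operation-name>",
  "clients": 8,
  "warmup-time-period": 120,
  "time-period": 300
}

With such an entry the task runs for a fixed 300 seconds (after a 120-second warm-up), regardless of how much data the cluster manages to process in that time.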
The typical structure of a challenge is the following sequence of tasks:
- delete test index (or indices) and all accompanying setup (e.g. index templates),
- create test index (or indices) and all accompanying setup (e.g. index templates),
- index entire corpus using a specific number of indexing clients,
- force merge and refresh,
- perform a sequence of search-related tasks, each specified through a number of iterations, and often a target throughput.
None of these tasks is time-limited (there are client-level timeouts when accessing Elasticsearch, but that's a slightly different topic). The initial setup typically takes little time and is negligible. Indexing will take as long as needed to process the entire corpus; if the target cluster has limited resources, this may take longer than on a cluster with more resources. Force merge also takes a noticeable amount of time.
Even search tasks that specify a number of iterations and a target throughput (e.g. 10 searches per second) are not guaranteed to complete in a predictable amount of time. If the target cluster can meet the target throughput, the time to execute the task will be (warmup-iterations + iterations) / target-throughput. For instance, with 500 warm-up iterations followed by 1000 regular iterations and a target throughput of 100 requests/s, the overall time will be 15 seconds. However, if the target cluster cannot meet the target throughput, Rally will report high latency but will still go through all the iterations. For instance, if in the above example the throughput drops for whatever reason to 1 request/s, the task will take not 15 seconds but 1500 seconds.
This doesn't mean nothing can be done. With existing tracks, you can experiment with the following options (a combined example invocation follows the list):
- choose a challenge that matches your needs, e.g. if you're only interested in indexing throughput, there's no point doing any searches (the default challenge of geonames is append-no-conflicts, but there's also an append-no-conflicts-index-only challenge which does only the indexing bit),
- reduce the corpus size using a track parameter (see the README of each track for the right parameters and their defaults, e.g. ingest_percentage in geonames); this is typically the quickest win, e.g. if it takes 2h to ingest the full corpus, an ingest percentage of 50% should reduce that to roughly 1h,
- skip tasks that are not important using either the include-tasks or exclude-tasks command-line option; exclusion is safer, as it's less likely to skip a task that's essential for getting reasonable results, or any results at all,
- tune track parameters to achieve the highest indexing throughput possible in your test cluster, as this will reduce the indexing time; this typically involves increasing the number of shards and bulk indexing clients (e.g. number_of_shards and bulk_indexing_clients in geonames).
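Putting a few of these together, an invocation could look roughly like the sketch below. The option names come from the Rally command-line documentation, but the exact spelling can differ between versions (newer releases use the esrally race subcommand), and the parameter values and host are placeholders, so treat it as a starting point rather than a copy-paste recipe.

# values and host are illustrative; adjust them to your track and cluster
esrally race --track=geonames \
  --challenge=append-no-conflicts-index-only \
  --track-params="ingest_percentage:50,bulk_indexing_clients:16" \
  --pipeline=benchmark-only \
  --target-hosts=<host>:9200
# to skip individual tasks of a challenge instead, add e.g. --exclude-tasks="<task-name>"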
Second, how can I get time-series info about throughput? Specifically, I want to see a time series of the throughput at intervals of k (ms). Is this possible?
Yes. The proper way is to store metrics in an Elasticsearch cluster (other than the one you're testing) by adding the following to rally.ini:
[reporting]
datastore.type = elasticsearch
datastore.host = <host-name>
datastore.port = 9200
datastore.secure = true
datastore.user = <user-name>
datastore.password = <password>
Rally creates 3 sets of indices: rally-races-*, rally-results-* and rally-metrics-*. The rally-results-* indices contain the same data as the tabular report printed at the end of the race, while rally-metrics-* contains all raw data collected by Rally, which includes throughput measurements. Throughput measurements are meaningful for any task that takes some time to complete, such as bulk indexing. Throughput is measured at 1-second intervals, so if your indexing took 1 hour to complete, you would get around 3600 data points in the series. This allows you to inspect how throughput changed throughout the race instead of looking only at statistics over the entire run (like median, max, or min).
To filter documents in rally-metrics-* you can use fields such as:
- race-id (the race ID as reported by Rally at the beginning of the race),
- name (the type of measurement, e.g. throughput, latency, service_time),
- task (the name of the task as specified in a track challenge),
- sample-type (warm-up vs. normal).
For instance, if I wanted to see throughput samples from indexing in the geonames track for a specific race, I would use the following KQL (Kibana Query Language) expression:
race-id: "<race-id>" and task: "index-append" and name: "throughput"
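If you'd rather aggregate the samples into fixed-width buckets yourself (e.g. every k milliseconds or seconds), you can also query the metrics store directly. Below is a rough sketch of such a query against rally-metrics-*, using a date_histogram aggregation; it assumes each document carries the throughput sample in a value field and a @timestamp field, which is worth verifying against a sample document from your Rally version.

# sketch: bucket throughput samples into 10-second intervals and average them
GET rally-metrics-*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term": { "race-id": "<race-id>" } },
        { "term": { "task": "index-append" } },
        { "term": { "name": "throughput" } }
      ]
    }
  },
  "aggs": {
    "throughput_over_time": {
      "date_histogram": { "field": "@timestamp", "fixed_interval": "10s" },
      "aggs": {
        "mean_throughput": { "avg": { "field": "value" } }
      }
    }
  }
}

This can be run from Kibana Dev Tools, and the same date_histogram can back a Kibana visualization if you prefer a chart over raw buckets.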
If not, is Kibana throughput as accurate as rally?
I don't know what you mean by "Kibana throughput". Kibana is definitely useful for presenting the data collected by Rally, as described in the previous section.
Please let me know if anything is unclear.
Thanks.