Rally max execution time and timeseries throughput metric

Hello,

I have been looking into the docs but have not been able to find information on two potential options I want from Rally.

First, does Rally have a max execution time parameter? Some of these tests run for a long time, so I was wondering if it is possible to limit the duration by time? Or can it be scaled down by the number of tasks?

Second, how can I get time-series info about throughput? Specifically, I want to see a time series of the throughput at k-millisecond intervals. Is this possible? If not, is Kibana throughput as accurate as Rally?

Hi Omid,

First, does Rally have a max execution time parameter? Some of these tests run for a long time, so I was wondering if it is possible to limit the duration by time?

There's no limit on the overall race (benchmark) duration. This would actually make little sense for non-trivial races, which have a certain structure; aborting in the middle would leave you with partial results.

Or can it be scaled down by the number of tasks?

Yes, somewhat.

Let me start with some background. A race consists of a sequence of tasks as specified by a track challenge and track parameters. Each task can be time-limited (see the time-period property in the schedule documentation), so you could create a time-limited custom track/challenge, provided all your tasks are time-limited or take very little time to execute. However, this is not how public challenges from Rally tracks are typically written.

The typical structure of a challenge is the following sequence of tasks:

  • delete test index (or indices) and all accompanying setup (e.g. index templates),
  • create test index (or indices) and all accompanying setup (e.g. index templates),
  • index entire corpus using a specific number of indexing clients,
  • force merge and refresh,
  • perform a sequence of search-related tasks, each specified through a number of iterations, and often a target throughput.

None of these tasks are time-limited (there are client-level timeouts when accessing Elasticsearch, but that's a slightly different topic). Initial setup typically takes little time and is negligible. Indexing will take as long as needed to process the entire corpus; if the target cluster has few resources, this may take longer than with a better-resourced cluster. Force merge also takes a noticeable amount of time.

Even search tasks that specify the number of iterations and a target throughput (e.g. 10 searches per second) are not guaranteed to complete in a specific amount of time. If the target cluster can meet the target throughput, the time to execute the task will be (warmup-iterations + iterations) / target-throughput. For instance, if there are 500 warm-up iterations followed by 1000 regular iterations, with a target throughput of 100 requests/s, the overall time will be 15 seconds. However, if the target cluster cannot meet the target throughput, Rally will report high latency but will still go through all the iterations. For instance, if in the above example the throughput drops for whatever reason to 1 request/s, the task will take not 15 seconds but 1500 seconds.
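To make the arithmetic above concrete, here is a tiny sketch of the best-case duration formula (a hypothetical helper, not part of Rally):

```python
def expected_task_duration(warmup_iterations, iterations, target_throughput):
    """Best-case task duration in seconds, assuming the cluster
    keeps up with the target throughput (requests per second)."""
    return (warmup_iterations + iterations) / target_throughput

# 500 warm-up + 1000 regular iterations at 100 requests/s:
print(expected_task_duration(500, 1000, 100))  # 15.0 seconds
# The same task if actual throughput collapses to 1 request/s:
print(expected_task_duration(500, 1000, 1))    # 1500.0 seconds
```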

This doesn't mean nothing can be done. With existing tracks, you can experiment with the following set of options:

  • choose a challenge that matches your needs, e.g. if you're only interested in indexing throughput, there's no point doing any search (e.g. geonames default challenge is append-no-conflicts but there's also append-no-conflicts-index-only challenge which does only the indexing bit),
  • reduce corpus size using a track parameter (see README of each track to find the right parameters and their defaults, e.g. ingest_percentage in geonames), this is typically the quickest win, e.g. if it takes 2h to ingest the corpus, ingest percentage set to 50% should reduce that to 1h,
  • skip tasks that are not important using either include-tasks or exclude-tasks command-line options - exclusion is safer as it's less likely to skip a task that's essential in getting reasonable results or results at all,
  • tune track parameters to achieve the highest indexing throughput possible in your test cluster as this will reduce the indexing time - this typically involves increasing the number of shards and bulk indexing clients (e.g. number_of_shards and bulk_indexing_clients in geonames).

Second, how can I get time-series info about throughput? Specifially I want to see timeseries of k (ms) intervals of the throughput at that moment. Is this possible?

Yes. The proper way is to store metrics in an Elasticsearch cluster (other than the one you're testing) with the following snippet in rally.ini:

[reporting]
datastore.type = elasticsearch
datastore.host = <host-name>
datastore.port = 9200
datastore.secure = true
datastore.user = <user-name>
datastore.password = <password>

Rally creates three sets of indices: rally-races-*, rally-results-* and rally-metrics-*. The rally-results-* indices contain the same data as the tabular report printed at the end of the race, while rally-metrics-* contains all raw data collected by Rally, which includes throughput measurements. Throughput measurements are meaningful for any task that takes some time to complete, such as bulk indexing. Throughput is measured at 1-second intervals; for example, if your indexing took 1 hour to complete, you would get around 3600 data points in the series. This allows you to inspect how throughput changed throughout the race instead of looking only at statistics over the entire run (like median, max, or min).
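Since Rally stores roughly one throughput sample per second, getting your k-millisecond view is mostly a matter of re-bucketing the raw samples after reading them back from rally-metrics-*. A minimal sketch (the rebucket helper and the sample data are hypothetical; timestamps are epoch milliseconds):

```python
from collections import defaultdict

def rebucket(samples, interval_ms):
    """Average raw (timestamp_ms, throughput) samples into fixed-width
    buckets of interval_ms milliseconds, keyed by bucket start time."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % interval_ms].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}

# Four 1-second samples re-bucketed into 2-second (2000 ms) intervals:
samples = [(0, 100.0), (1000, 110.0), (2000, 90.0), (3000, 95.0)]
print(rebucket(samples, 2000))  # {0: 105.0, 2000: 92.5}
```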

To filter documents in rally-metrics-* you can use fields such as:

  • race-id (the race ID as reported by Rally at the beginning of the race),
  • name (type of measurement, e.g. throughput, latency, service_time),
  • task (name of the task as specified in a track challenge),
  • sample-type (warm-up vs. normal).

For instance, if I wanted to see throughput samples from indexing in geonames track from a specific race, I would use the following KQL (Kibana Query Language) expression:

race-id: "<race-id>" and task: "index-append" and name: "throughput"
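If you query Elasticsearch directly instead of going through Kibana, the same filter can be expressed in Query DSL (a sketch against rally-metrics-*, assuming the field names listed above; `<race-id>` is the placeholder from the KQL example):

```
GET /rally-metrics-*/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "race-id": "<race-id>" } },
        { "term": { "task": "index-append" } },
        { "term": { "name": "throughput" } }
      ]
    }
  }
}
```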

If not, is Kibana throughput as accurate as Rally?

I don't know what you mean by "Kibana throughput". Kibana is definitely useful in presenting the data collected by Rally as described in the previous section.

Please let me know if anything is unclear.

Thanks.

Hi Grzegorz,

Thank you so much for your detailed response! I had some follow-up questions. I also wanted to provide some background on the task I hope to accomplish so you understand my needs better. I'm a student researching Elasticsearch. Specifically, I am intentionally slowing down the network/disk, and I would like to observe the changes in Elasticsearch's performance with respect to metrics such as throughput and latency. To achieve this, I have set up a 3-node Elasticsearch (8.12.0) cluster independently of Rally using docker-compose. I am able to throttle network/disk for the containers, and by running a race with Rally, I hope to observe the change in throughput as a time series while the race is running. Following your response:

There's no limit for the overall race (benchmark) duration. This would actually make little sense for non-trivial races which have certain structure, and aborting in the middle would leave you with partial results.

I have realized that a test such as geonames takes a long time to complete (~1.5 h). I suspect a reason for this is perhaps using the benchmark-only pipeline from Rally. I know that through Rally I can also create an Elasticsearch cluster of Docker containers. However, for the purpose of my tests, I would need to locally bind the index directory of the Elasticsearch node to a directory (through volume mounts). Is it possible to configure Rally to set up the cluster this way? Specifically, I want Rally to set up 3 Docker containers with mounted volumes for indices.

I don't know what you mean by "Kibana throughput". Kibana is definitely useful in presenting the data collected by Rally as described in the previous section.

Apologies for being vague. What I meant is that, as I understand, Kibana also provides metrics for the throughput of the system. Will the numbers from Kibana match those of Rally?

I really appreciate your detailed response. I don't have more questions at the moment, but can I follow up here with more questions as I work on generating the throughput time series or should I open a new post? Thanks!

Thank you for the additional context, your objective sounds interesting!

I have realized that a test such as geonames takes a long time to complete (~1.5 h). I suspect a reason for this is perhaps using the benchmark-only pipeline from Rally.

1.5 h with which challenge? If that's the default challenge, append-no-conflicts, it doesn't strike me as long.

I know that through Rally, I can also create an Elasticsearch cluster of docker containers.

There's an option to start a single-node Elasticsearch cluster running in a Docker container, with pre-determined volume mounts. That's an undocumented feature, so you should treat it as unsupported / internal use only. Based on your description, it does not match your needs. I would suggest sticking with the benchmark-only pipeline and preparing all the Elasticsearch setup outside of Rally.

Rally does not apply any magic to the Elasticsearch configuration that would make your setup faster. There is no getting away from understanding Elasticsearch and making sure your setup makes sense. For instance, with 3 nodes you may want to use 3 shards (or a multiple of 3) in your test index and make sure each shard lands on a different Elasticsearch node, for maximum indexing throughput.
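For illustration, one way to pin one shard per node on a 3-node cluster is the standard index.routing.allocation.total_shards_per_node index setting (a sketch; the index name is hypothetical, and with Rally tracks you would normally control the shard count via a track parameter such as number_of_shards instead of creating the index by hand):

```
PUT /my-test-index
{
  "settings": {
    "index.number_of_shards": 3,
    "index.number_of_replicas": 0,
    "index.routing.allocation.total_shards_per_node": 1
  }
}
```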

Apologies for being vague. What I meant is that, as I understand, Kibana also provides metrics for the throughput of the system. Will the numbers from Kibana match those of Rally?

Kibana is an interface for accessing data stored in Elasticsearch. It doesn't generate the data itself, hence my earlier question. I'm guessing you might be talking about the Stack Monitoring feature. That data is collected either by the Elasticsearch cluster itself or by Metricbeat / Elastic Agent. When it comes to overall indexing throughput, you should see similar results in Rally and in Stack Monitoring after selecting the right time interval. Search latencies / service times as perceived by the Elasticsearch client can only be retrieved from Rally.

can I follow up here with more questions as I work on generating the throughput time series or should I open a new post?

You can use this thread unless you have a completely unrelated question which will be easier to find later with a dedicated topic.

Hello,

I've encountered an issue with the second Elasticsearch cluster you mentioned for gathering metrics, but I could not find documentation on it.

The issue is that for my first Elasticsearch Docker cluster (created through docker-compose), I mount a local data path to the data directory of the cluster (one node shown as an example here):

es01:
    image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
    container_name: es01
    user: "${UID}:${GID}"
    group_add:
        - "0"
    environment:
      node.name: es01
      cluster.name: ${CLUSTER_NAME}
      cluster.initial_master_nodes: es01,es02,es03
      xpack.security.enabled: false
      discovery.seed_hosts: es01,es03
      ES_JAVA_OPTS: -Xms512m -Xmx512m
      bootstrap.memory_lock: true
    volumes:
      - ./es01/data:/usr/share/elasticsearch/data/
      - ./es01/logs:/usr/share/elasticsearch/logs/
    ports:
      - 9200:9200
      - 9300:9300
    ulimits:
      memlock:
        soft: -1
        hard: -1
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:9200 || exit 1"]
      interval: 5s
      timeout: 5s
      retries: 100
    networks:
      - elastic_network

Now, when I create a second cluster for gathering metrics, I want to bind the indices directory to a local path so it is persistent and I can store it for later analysis. So I attempted to create the nodes of the second cluster as follows:

m_es01:
    image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
    container_name: m_es01
    user: "${UID}:${GID}"
    group_add:
        - "0"
    environment:
      node.name: m_es01
      cluster.name: metrics
      cluster.initial_master_nodes: m_es01,m_es02,m_es03
      xpack.security.enabled: false
      discovery.seed_hosts: m_es01,m_es03
      ES_JAVA_OPTS: -Xms512m -Xmx512m
      bootstrap.memory_lock: true
    volumes:
      - ./m_es01/:/usr/share/elasticsearch/data/
    ports:
      - 8400:8400
      - 8300:8300
    ulimits:
      memlock:
        soft: -1
        hard: -1
    networks:
      - elastic_network

However, I get the following error, which makes it look like both nodes are trying to lock the same directory ("failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?"). This was confusing to me, since I assumed these data paths are independent for each cluster:

{"@timestamp":"2024-03-23T20:11:30.918Z", "log.level":"ERROR", "message":"fatal exception while booting Elasticsearch", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.bootstrap.Elasticsearch","elasticsearch.node.name":"m_es01","elasticsearch.cluster.name":"metrics","error.type":"java.lang.IllegalStateException","error.message":"failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?","error.stack_trace":"java.lang.IllegalStateException: failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?\n\tat org.elasticsearch.server@8.12.1/org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:296)\n\tat org.elasticsearch.server@8.12.1/org.elasticsearch.node.NodeConstruction.validateSettings(NodeConstruction.java:484)\n\tat org.elasticsearch.server@8.12.1/org.elasticsearch.node.NodeConstruction.prepareConstruction(NodeConstruction.java:246)\n\tat org.elasticsearch.server@8.12.1/org.elasticsearch.node.Node.<init>(Node.java:181)\n\tat org.elasticsearch.server@8.12.1/org.elasticsearch.bootstrap.Elasticsearch$2.<init>(Elasticsearch.java:236)\n\tat org.elasticsearch.server@8.12.1/org.elasticsearch.bootstrap.Elasticsearch.initPhase3(Elasticsearch.java:236)\n\tat org.elasticsearch.server@8.12.1/org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:73)\nCaused by: java.io.IOException: failed to obtain lock on /usr/share/elasticsearch/data\n\tat org.elasticsearch.server@8.12.1/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:241)\n\tat org.elasticsearch.server@8.12.1/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:209)\n\tat 
org.elasticsearch.server@8.12.1/org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:288)\n\t... 6 more\nCaused by: java.nio.file.NoSuchFileException: /usr/share/elasticsearch/data/node.lock\n\tat java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)\n\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)\n\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)\n\tat java.base/sun.nio.fs.UnixPath.toRealPath(UnixPath.java:834)\n\tat org.apache.lucene.core@9.9.2/org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:94)\n\tat org.apache.lucene.core@9.9.2/org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:43)\n\tat org.apache.lucene.core@9.9.2/org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:44)\n\tat org.elasticsearch.server@8.12.1/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:234)\n\t... 8 more\n\tSuppressed: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/node.lock\n\t\tat java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:90)\n\t\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)\n\t\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)\n\t\tat java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:261)\n\t\tat java.base/java.nio.file.Files.newByteChannel(Files.java:379)\n\t\tat java.base/java.nio.file.Files.createFile(Files.java:657)\n\t\tat org.apache.lucene.core@9.9.2/org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:84)\n\t\t... 11 more\n"}

I would appreciate it if you have any hints about resolving this, thank you!