I'm a bit new to Rally and have recently been trying it out. My goal is to compare the performance of two Elasticsearch clusters I have access to (the classic "old one" and the new "on kubernetes one"). I was hoping for a rather "quick" test suite that ends with a performance rating, which we could then use to measure the effect of tweaking things on the new "on kubernetes" cluster.
The useful bit of the documentation for this was:
Running on my laptop, I tried a simple docker run --rm -ti --name elasticsearch -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:7.10.1 followed by esrally race --pipeline=benchmark-only --target-hosts=localhost:9200. This turned out to be a rather long test, so I tried to dig for a smaller dataset. After some tweaking I got to using:
Thank you for your interest in Rally! In general, each Rally track evaluates the performance of a particular feature. For example, the percolator track evaluates the performance of percolation queries. So you might want to pick a track that is close to what your actual workload will be. esrally list tracks shows a short description of each track and its existing challenges. For a more detailed description, please take a look at the README files under each track in rally-tracks.
We do have an --exclude-tasks option. You could consider running a smaller subset of queries to speed up the test. Also, for each track there are often challenges that only test indexing throughput (they have index-only in the name), and those could be faster (if indexing performance is what you are trying to tune for). However, we don't use indexing throughput numbers for percolator and noaa as a metric.
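To make the suggestion above a bit more concrete, here is a minimal sketch of how such a trimmed-down run could be assembled. It only builds and prints the command line; it does not run Rally. The helper name build_race_command is made up for illustration, and the track (geonames) and challenge (append-no-conflicts-index-only) are just examples, so check esrally list tracks for what is available in your installation. The sketch also assumes that --include-tasks and --exclude-tasks should not be combined, and that --test-mode (which runs on a heavily reduced data set) is only good for a smoke test, not for real numbers.

```python
import shlex


def build_race_command(target_hosts, track, challenge=None,
                       include_tasks=None, exclude_tasks=None,
                       test_mode=False):
    """Return the argv list for an `esrally race` against an existing cluster.

    Hypothetical helper for illustration; it only assembles the command.
    """
    if include_tasks and exclude_tasks:
        # Assumption: the two task filters are treated as mutually exclusive.
        raise ValueError("use either include_tasks or exclude_tasks, not both")

    argv = [
        "esrally", "race",
        "--pipeline=benchmark-only",          # benchmark a cluster Rally did not start
        f"--target-hosts={target_hosts}",
        f"--track={track}",
    ]
    if challenge:
        argv.append(f"--challenge={challenge}")
    if include_tasks:
        argv.append("--include-tasks=" + ",".join(include_tasks))
    if exclude_tasks:
        argv.append("--exclude-tasks=" + ",".join(exclude_tasks))
    if test_mode:
        # Tiny data set: useful to check the setup, not to measure performance.
        argv.append("--test-mode")
    return argv


# Example: indexing-only run of an example track against a local node.
cmd = build_race_command(
    "localhost:9200",
    "geonames",
    challenge="append-no-conflicts-index-only",
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell as-is, or the list can be handed to subprocess.run once the track and challenge names have been checked.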
I understand that there are very specific things to test with Rally and that it is tailored for advanced benchmarks. I might need that at some point, and it feels like a really good tool for it. For now, though, my two "naive" interests are "what's the performance of indexing" and "what's the performance of simple search" (we used https://locust.io/ at some point for the latter).
Do you think Rally could have a track/dataset that gives a quick, summarised answer for a first "impression" (not a detailed study) of these metrics, a bit like pgbench (PostgreSQL: Documentation: 13: pgbench) does for a naive approach to performance for PostgreSQL?
This is primarily to have simple ways of communicating with a hosting service or IT department to be able to compare existing and new services or when a cluster characteristic is changed (for example, "hey I've added some RAM, is it working faster?" - "me: launched the rally test, we've gone from X to Y, that's good!")
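As a rough illustration of the "from X to Y" comparison described above, the sketch below computes the percentage change per metric between two runs. The metric names and numbers are invented placeholders, not real results; in practice they would be copied from the summary reports of two races. Rally also ships an esrally compare subcommand (taking --baseline and --contender race IDs) that produces a fuller comparison, so this is only meant for a quick one-line answer.

```python
def relative_change(baseline, contender):
    """Return the percent change for every metric present in both runs."""
    changes = {}
    for name, old in baseline.items():
        new = contender.get(name)
        if new is None or old == 0:
            # Skip metrics missing from the contender or with a zero baseline.
            continue
        changes[name] = (new - old) / old * 100.0
    return changes


# Hypothetical numbers, standing in for values read from two race summaries.
old_cluster = {
    "indexing_throughput_docs_s": 40000.0,
    "search_latency_p99_ms": 120.0,
}
new_cluster = {
    "indexing_throughput_docs_s": 50000.0,
    "search_latency_p99_ms": 90.0,
}

for metric, pct in relative_change(old_cluster, new_cluster).items():
    print(f"{metric}: {pct:+.1f}%")
```

Note that the sign has to be read per metric: a positive change is good for throughput and bad for latency.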