I fear Rally is not the right tool for this job. Its main purpose is benchmarking, and one of its core assumptions is that the system is in a steady state so that we can perform reproducible measurements.
What you're after seems more like a QA test that ensures the cluster stays stable after it has lost a node. The Python client can use sniffing to learn about the current cluster topology, so it will know about disappearing or new cluster nodes. As a corollary of requiring a system in steady state, Rally does not enable cluster sniffing. We also do not retry failed requests; we only record that a failure has happened, because retries would skew measurement results as well.
It would probably make more sense to write a small test harness instead, e.g. using the Elasticsearch Python client, with sniffing enabled and maybe also retry handling for failed requests. You could still leverage the data sets that we offer with Rally to generate some load. At the end of your test run you'd likely also want to assert that all documents have been ingested successfully.
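To make that a bit more concrete, here is a rough sketch of what such a harness could look like. Treat it as a starting point under a few assumptions, not as a finished tool: `index_with_retries` and `make_client` are hypothetical helper names, the backoff values are arbitrary, and the sniffing parameters are the ones I know from the 7.x Python client (newer client versions rename some of them, so please check the docs for the version you use).

```python
import time


def index_with_retries(index_doc, docs, max_retries=3, backoff_seconds=0.5,
                       sleep=time.sleep):
    """Index every document, retrying failed requests with exponential backoff.

    index_doc is any callable that sends one document and raises on failure.
    Returns (number of documents indexed, list of documents that never made it).
    """
    succeeded = 0
    failed = []
    for doc in docs:
        for attempt in range(max_retries + 1):
            try:
                index_doc(doc)
                succeeded += 1
                break
            except Exception:
                if attempt == max_retries:
                    # Out of retries: remember the document for the final check.
                    failed.append(doc)
                else:
                    sleep(backoff_seconds * (2 ** attempt))
    return succeeded, failed


def make_client(hosts):
    """Create a client that sniffs the cluster topology (assumes a 7.x client)."""
    # Imported lazily so the retry helper can be tried without the package.
    from elasticsearch import Elasticsearch

    return Elasticsearch(
        hosts,
        sniff_on_start=True,            # learn the topology before the first request
        sniff_on_connection_fail=True,  # re-sniff when a node stops responding
        sniffer_timeout=60,             # and refresh the node list every 60 seconds
    )
```

You would then pass something like `lambda doc: es.index(index="my-test-index", body=doc)` as `index_doc` (the index name is made up), kill a node while the loop is running, and at the end assert that `failed` is empty and that the document count in the index matches the number of documents you sent.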