I found that Rally uses the Actor model under the hood to generate load. From the documentation, I understand that the target throughput is achieved by all clients together. However, I do not understand how the number of clients set in a race relates to CPU cores. Can someone help me understand how these two are related? Basically, how is the number of clients set in a Rally schedule mapped to Actors, and in turn to CPU threads, to generate the target throughput?
If there is any existing documentation on this, please point me to it.
These are largely implementation details, and as a user of Rally you shouldn't really need to be aware of them, but let me try and explain.
Rally is written in Python, which has a mechanism called the Global Interpreter Lock (GIL) that prevents multiple threads of the same Python process from running on CPU in parallel.
The GIL limits the total throughput we could achieve with a single Python process, so to work around it we allocate one Actor per available core, where each Actor is a separate Python process responsible for running a single asyncio event loop.
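The shape of that model can be sketched roughly as follows. This is not Rally's actual code, just a minimal illustration of the idea: one worker process ("Actor") per core so each has its own GIL, and within each process a single asyncio event loop running several "clients" as coroutines. The `client` and `actor` names and the request counts are invented for the example.

```python
import asyncio
import multiprocessing as mp


async def client(client_id: int, requests: int) -> int:
    # Each client is a coroutine; awaiting here stands in for waiting
    # on async I/O (e.g. an HTTP request to the cluster), which is why
    # many clients can share one event loop inside one process.
    done = 0
    for _ in range(requests):
        await asyncio.sleep(0)  # stand-in for an async request
        done += 1
    return done


def actor(num_clients: int, requests: int) -> int:
    # An "actor" in this sketch is a separate Python process with its
    # own GIL, running a single asyncio event loop for all its clients.
    async def run() -> int:
        results = await asyncio.gather(
            *(client(i, requests) for i in range(num_clients))
        )
        return sum(results)

    return asyncio.run(run())


if __name__ == "__main__":
    cores = 2  # imagine this defaults to the number of available cores
    total_clients = 8
    per_actor = total_clients // cores  # clients spread across actors
    with mp.Pool(cores) as pool:
        totals = pool.starmap(actor, [(per_actor, 10)] * cores)
    print(sum(totals))  # 2 actors * 4 clients * 10 requests = 80
```

The key point the sketch tries to show: parallelism across cores comes from separate processes, while concurrency among the clients assigned to one Actor comes from the event loop, not from OS threads.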
Thanks Brad. This helps me picture the flow. One follow-up question: I read that the latency metrics reported in the summary report include the time a request spends waiting in a queue after creation but before being sent to the cluster. Where does this queue fit in the picture? Is it part of each client within an Actor, or part of the Actor itself?