I found that Rally uses the Actor model to generate load under the hood. From the documentation, I understand that the target throughput is achieved by all clients together. However, I do not understand how the number of clients set in a race relates to CPU cores. Can someone help me understand how these two are related? Specifically, how is the number of clients set in a Rally schedule mapped to Actors, and in turn to CPU threads, to generate the target throughput?
If there is any existing documentation on this, please point me to it.
These are largely implementation details, and as a user of Rally you shouldn't really need to be aware of them, but let me try to explain.
Rally is written in Python, which has a mechanism called the Global Interpreter Lock (GIL) that prevents multiple threads of the same Python process from running on CPU cores in parallel.
The GIL limits the total throughput we could achieve using a single Python process, so to work around it we allocate one 'Actor' per available core, where an Actor is a separate Python process responsible for running a single asyncio event loop.
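To illustrate the "one process, one event loop" idea: a single actor can interleave many clients on one core because each simulated request yields control back to the loop while it waits. This is a minimal sketch with made-up names, not Rally's actual code:

```python
import asyncio

async def client(client_id: int, results: list) -> None:
    # Each simulated request awaits I/O, yielding to the event loop so
    # one actor can interleave many clients on a single core.
    await asyncio.sleep(0.01)
    results.append(client_id)

async def actor(num_clients: int) -> list:
    # One actor = one Python process = one asyncio event loop.
    results = []
    await asyncio.gather(*(client(i, results) for i in range(num_clients)))
    return results

results = asyncio.run(actor(4))
```

Because the clients spend their time awaiting rather than computing, the GIL is not a bottleneck within a single actor; it only matters for CPU-bound work, which is why the actor processes themselves are spread across cores.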
The clients defined in a specific task's `clients` property are then evenly distributed across the available 'Actors' (i.e. Python processes/asyncio event loops).
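The distribution step can be sketched like this (a hypothetical round-robin allocation, not Rally's actual implementation):

```python
def distribute_clients(num_clients: int, num_actors: int) -> list[list[int]]:
    """Assign client IDs to actors as evenly as possible (round-robin)."""
    allocations = [[] for _ in range(num_actors)]
    for client_id in range(num_clients):
        allocations[client_id % num_actors].append(client_id)
    return allocations

# e.g. 8 clients on an (assumed) 3-core machine, one actor per core:
actors = distribute_clients(8, 3)
# each actor then runs its share of clients on its own asyncio event loop
```

With 8 clients and 3 actors, two actors get 3 clients and one gets 2, so no single process is responsible for much more load than the others.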
Thanks Brad. This helps me picture the flow. One follow-up question: I read that the latency metric reported in the summary includes the time a request spends waiting in a queue after creation but before being sent to the cluster. Where is this queue in the picture? Is it part of each client within an Actor, or of the Actor itself?
This really is an implementation detail, but right now it is per client within an Actor.
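As a rough mental model (a sketch under assumptions, not Rally's implementation): picture each client as a coroutine with its own queue, where latency is measured from the moment a request is created, so it includes any time spent waiting behind earlier requests in that per-client queue:

```python
import asyncio
import time

async def client(queue: asyncio.Queue, latencies: list) -> None:
    while True:
        created_at, request = await queue.get()
        if request is None:  # sentinel: no more requests
            break
        # "Service" the request (stands in for the HTTP round trip).
        await asyncio.sleep(0.01)
        # Latency is measured from creation time, so time spent queued
        # behind earlier requests is included.
        latencies.append(time.perf_counter() - created_at)

async def main() -> list:
    queue = asyncio.Queue()  # one queue per client within an actor
    latencies = []
    # Enqueue two requests at once: the second must wait in the queue
    # while the first is being serviced.
    for i in range(2):
        queue.put_nowait((time.perf_counter(), f"request-{i}"))
    queue.put_nowait((None, None))
    await client(queue, latencies)
    return latencies

latencies = asyncio.run(main())
```

Here the second request's latency is roughly twice the first's, because it includes about one service-time's worth of queue wait, which is the effect the latency metric captures.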
If you're interested in understanding this in more detail, take a look at these files for a start: