How write throughput is calculated in Rally

This is a complicated topic because Rally can operate in a distributed way and thus needs to consider samples from all load drivers before it calculates global metrics, such as the total throughput.

The algorithm can be found in the ThroughputCalculator class.
Perhaps the best way to explain this is from Rally's own unit tests.

If you check this unit test, you will find an example that assumes two drivers (the first argument of each sample is 0 or 1).

samples = [
    driver.Sample(0, 1470838595, 21, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 1, 1 / 9),
    driver.Sample(0, 1470838596, 22, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 2, 2 / 9),
    driver.Sample(0, 1470838597, 23, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 3, 3 / 9),
    driver.Sample(0, 1470838598, 24, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 4, 4 / 9),
    driver.Sample(0, 1470838599, 25, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 5, 5 / 9),
    driver.Sample(0, 1470838600, 26, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 6, 6 / 9),
    driver.Sample(1, 1470838598.5, 24.5, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 4.5, 7 / 9),
    driver.Sample(1, 1470838599.5, 25.5, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 5.5, 8 / 9),
    driver.Sample(1, 1470838600.5, 26.5, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 6.5, 9 / 9)
]

The signature for Sample explains each of those parameters:

def __init__(self, client_id, absolute_time, relative_time, task, sample_type, request_meta_data, latency_ms, service_time_ms,
                 total_ops, total_ops_unit, time_period, percent_completed):

Rally first sorts the samples by absolute time:

    driver.Sample(0, 1470838595, 21, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 1, 1 / 9),
    driver.Sample(0, 1470838596, 22, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 2, 2 / 9),
    driver.Sample(0, 1470838597, 23, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 3, 3 / 9),
    driver.Sample(0, 1470838598, 24, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 4, 4 / 9),
    driver.Sample(1, 1470838598.5, 24.5, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 4.5, 7 / 9),
    driver.Sample(0, 1470838599, 25, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 5, 5 / 9),
    driver.Sample(1, 1470838599.5, 25.5, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 5.5, 8 / 9),
    driver.Sample(0, 1470838600, 26, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 6, 6 / 9),
    driver.Sample(1, 1470838600.5, 26.5, op, metrics.SampleType.Normal, None, -1, -1, 5000, "docs", 6.5, 9 / 9)

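For each of the first four timestamps the calculated throughput is 5000 docs/s, because each sample adds 5000 docs and one more second of elapsed time, so the cumulative total divided by the elapsed time stays at 5000.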

Starting with timestamp 1470838599, though, the samples from the second driver are also counted, so the calculation becomes:

(4 * 5000 (total docs of first four samples) + 5000 (timestamp 1470838598.5) + 5000 (timestamp 1470838599)) / 5 = 6000 docs/s

Similarly, by timestamp 1470838600 two more samples have been collected (at 1470838599.5 and 1470838600), each contributing 5000 docs, so the throughput at that point is:

(30000 (total docs at 1470838599) + 5000 + 5000) / 6 = 6666.666666666667 docs/s
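The arithmetic above can be reproduced with a short Python sketch. This is a simplified illustration, not Rally's actual ThroughputCalculator code: the trimmed-down `Sample` tuple, the `global_throughput` function, and the `bucket_interval_secs` parameter are names assumed here for the example. It also assumes that the measurement starts one `time_period` before the first sample and that a value is emitted only once a full one-second bucket has elapsed.

```python
from collections import namedtuple

# Hypothetical, trimmed-down stand-in for driver.Sample: only the fields
# this sketch needs.
Sample = namedtuple("Sample", ["client_id", "absolute_time", "total_ops", "time_period"])


def global_throughput(samples, bucket_interval_secs=1):
    """Return (absolute_time, throughput) pairs, one per elapsed bucket."""
    if not samples:
        return []
    # Merge the samples of all drivers into one timeline.
    ordered = sorted(samples, key=lambda s: s.absolute_time)
    # Assumption: the measurement started one time_period before the first sample.
    start = ordered[0].absolute_time - ordered[0].time_period
    total_ops = 0
    interval = 0
    next_bucket = 0
    result = []
    for s in ordered:
        total_ops += s.total_ops
        interval = max(s.absolute_time - start, interval)
        # Emit a value only when a new full bucket has been reached.
        if interval > 0 and interval >= next_bucket:
            next_bucket = int(interval) + bucket_interval_secs
            result.append((s.absolute_time, total_ops / interval))
    return result


# The nine samples from the unit test: six from driver 0, three from driver 1.
samples = [Sample(0, 1470838595 + i, 5000, 1 + i) for i in range(6)]
samples += [Sample(1, 1470838598.5 + i, 5000, 4.5 + i) for i in range(3)]

for timestamp, throughput in global_throughput(samples):
    print(timestamp, throughput)
```

Under these assumptions the sketch yields 5000 docs/s for the first four timestamps, 6000 docs/s at 1470838599, and about 6666.67 docs/s at 1470838600, matching the hand calculation. The samples at the half-second marks only contribute to the running total and do not produce a value of their own.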

This process continues for all samples. Finally, the summary output calculates the min/median/max throughput from this list, for normal samples only. Samples collected during the warmup period are not included in the summary report calculation.
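As a rough illustration of that last step, the following Python sketch filters a list of throughput values by sample type and summarizes the rest. The six normal values are the ones from the example above. The `"warmup"` entry is a made-up value included only to show the filtering, and the string labels are an assumption standing in for Rally's sample types.

```python
import statistics

# Throughput values in docs/s, each tagged with a sample type.
# The "warmup" value is hypothetical; the "normal" values come from the example.
throughput = [
    ("warmup", 4200.0),
    ("normal", 5000.0),
    ("normal", 5000.0),
    ("normal", 5000.0),
    ("normal", 5000.0),
    ("normal", 6000.0),
    ("normal", 40000 / 6),
]

# Keep only values from normal samples; warmup values are ignored.
normal = [value for sample_type, value in throughput if sample_type == "normal"]

summary = {
    "min": min(normal),
    "median": statistics.median(normal),
    "max": max(normal),
}
print(summary)
```

With these values the minimum and median are both 5000 docs/s and the maximum is about 6666.67 docs/s. The warmup value does not affect any of the three.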