Internal Rally queue in throughput-throttled mode

Hi

I have a question regarding latency and ES thread pool queues. I am running Rally in throughput-throttled mode, targeting a throughput higher than ES can achieve in benchmarking mode. Some of the numbers I get are:

| Median Throughput | my-query | 6390.59 | ops/s |
| 50th percentile latency | my-query | 11618.4 | ms |
| 99.99th percentile latency | my-query | 27001.3 | ms |
| 50th percentile service time | my-query | 4.23791 | ms |
| 99.99th percentile service time | my-query | 59.8257 | ms |
| error rate | my-query | 0 | % |

In monitoring I can also see that the largest search queue size during the benchmark was just 6.

So far this makes sense: latency is much larger than service time. As per the FAQ:

"Rally runs in throughput-throttled mode and generates requests according to this schedule regardless of how fast Elasticsearch can respond. In this mode the generated requests are first placed in a queue within Rally and may stay there for some time."

So Rally is holding the requests itself for some time before sending them to ES, and thus the ES search queue does not grow. My questions are:

  • Why is Rally doing this? Wouldn't it be better if the queue growth showed up in ES (eventually rejecting search requests)?
  • How does Rally decide how long to keep requests in its internal queue? The FAQ says 'for some time', which sounds fuzzy.

thanks

Hi @jmlucjav, Rally will only attempt to run the specified number of operations per second in throughput-throttled mode. Any value, high or low, for the largest Elasticsearch search queue size is enough to indicate that the cluster received more search requests than it could handle at any one time. A low value means the cluster was saturated, but perhaps not saturated enough.

target-throughput sets the number of operations per second for the search task. The length of time operations sit in Rally's queue varies with the target-throughput, so we cannot say in advance how long they will be there. They sit in Rally's queue until enough time has passed for the requests to be submitted at the throughput specified in target-throughput.
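
To make the difference between latency and service time more concrete, here is a minimal sketch in Python. It is not Rally's actual code, just a simplified model under stated assumptions: a single client, a constant service time, and a client that only sends the next request once the previous one has returned. The function name `simulate` and all parameter values are hypothetical and chosen for illustration only.

```python
def simulate(target_throughput, service_time_s, iterations):
    """Model one client issuing requests on a fixed schedule.

    Hypothetical, simplified model (not Rally's implementation):
    - requests are scheduled every 1 / target_throughput seconds
    - the client sends a request only after the previous one has returned
    - every request takes exactly service_time_s seconds in the cluster
    Returns a list of (service_time, latency) tuples in seconds.
    """
    interval = 1.0 / target_throughput
    clock = 0.0  # time at which the client becomes free again
    results = []
    for i in range(iterations):
        scheduled = i * interval        # when the request should have been sent
        start = max(clock, scheduled)   # when it is actually sent
        end = start + service_time_s    # when the response arrives
        clock = end
        service_time = end - start      # time spent in the cluster
        latency = end - scheduled       # service time plus waiting time in the queue
        results.append((service_time, latency))
    return results


# Assumed numbers: target of 500 ops/s, but each request takes 4 ms,
# so a single client can sustain at most 250 ops/s.
results = simulate(target_throughput=500, service_time_s=0.004, iterations=1000)
service_first, latency_first = results[0]
service_last, latency_last = results[-1]
print(f"first request: service time {service_first * 1000:.1f} ms, latency {latency_first * 1000:.1f} ms")
print(f"last request:  service time {service_last * 1000:.1f} ms, latency {latency_last * 1000:.1f} ms")
```

In this model the service time stays at 4 ms for every request, while the latency of the last request grows to about 2 seconds, because each request falls further behind its scheduled send time. The client also never has more than one request in flight, which is one way the backlog can build up on the load generator side rather than in the cluster.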

If you want to saturate the search queue beyond 6 requests while measuring service time, you could try increasing target-throughput.

Thank you,
Jason

Hi Jason,

Understood.

What I now see is that you also need to increase the iterations, so that Rally has time to reach the target throughput...

thanks!