I have a question regarding latency and ES thread pool queues. I am running Rally in throughput-throttled mode, targeting a throughput higher than what ES achieves in benchmarking mode. Some of the numbers I get are:
| Metric | Task | Value | Unit |
|---|---|---|---|
| Median Throughput | my-query | 6390.59 | ops/s |
| 50th percentile latency | my-query | 11618.4 | ms |
| 99.99th percentile latency | my-query | 27001.3 | ms |
| 50th percentile service time | my-query | 4.23791 | ms |
| 99.99th percentile service time | my-query | 59.8257 | ms |
| error rate | my-query | 0 | % |
In monitoring, I can also see that the largest search queue size during the benchmark was just 6.
So far this makes sense: latency is much larger than service time. As per the FAQ:
"Rally runs in throughput-throttled mode and generates requests according to this schedule regardless of how fast Elasticsearch can respond. In this mode the generated requests are first placed in a queue within Rally and may stay there for some time."
So Rally holds the requests in its own queue for some time before sending them to ES, and therefore the ES search queue does not grow. My questions are:
- Why does Rally do this? Wouldn't it be better for the queue growth to show up in ES (which would eventually reject search requests)?
- How does Rally decide how long a request waits in its internal queue? The quote above only says 'for some time', which sounds fuzzy.
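
To make my current understanding concrete, here is a minimal sketch of how I imagine latency can grow far beyond service time. It is not Rally's actual code: the `simulate` function, the single synchronous client, and the constant 4 ms service time are all my own assumptions for illustration. Each request has a scheduled send time based on the target throughput, but the client can only send it once the previous response is back, so the wait accumulates on the client side.

```python
def simulate(target_throughput, service_time_s, num_requests):
    """Simulate one synchronous client following a fixed request schedule.

    Hypothetical model, not Rally's implementation.
    Returns a list of (service_time, latency) tuples in seconds.
    """
    results = []
    previous_finish = 0.0
    for i in range(num_requests):
        # When the request should be sent according to the schedule.
        scheduled = i / target_throughput
        # The client is busy until the previous response has arrived.
        start = max(scheduled, previous_finish)
        finish = start + service_time_s
        # Service time: send to response. Latency: scheduled time to response.
        results.append((finish - start, finish - scheduled))
        previous_finish = finish
    return results


if __name__ == "__main__":
    # Target 1000 ops/s, but a 4 ms service time caps this client at 250 ops/s.
    samples = simulate(target_throughput=1000, service_time_s=0.004, num_requests=10_000)
    for index in (0, 100, 9_999):
        service, latency = samples[index]
        print(f"request {index}: service time {service * 1000:.2f} ms, "
              f"latency {latency * 1000:.2f} ms")
```

In this model the service time stays at 4 ms for every request, while latency keeps growing for as long as the target throughput exceeds what the client can achieve. If that is roughly what Rally does, it would explain my numbers, but I would like to confirm it.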