Hi @xihuanbanku,
Rally issues requests synchronously, so it does wait for a response. In fact, we have two benchmarking modes:
- Something that I'd call a "throughput benchmark": Rally issues requests as fast as it can (in a blocking fashion, not async!). This mode is usually used for bulk indexing.
- Throughput-throttled: You can also define a target throughput, and Rally aims to achieve it. If Elasticsearch responds faster than the target requires, Rally waits before issuing the next request. If Elasticsearch takes longer, Rally issues the next request immediately after the previous one completes. So it tries to match the target throughput as closely as it can. If Rally cannot reach the target throughput, you will see high latency numbers (much higher than the service time). This is a sign that your setup (hardware + ES config) is at or over peak capacity and that you should reduce the target throughput.
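To make the throttled mode more concrete, here is a minimal sketch of the idea. It is not Rally's actual code: the function `simulate_throttled_client` and its parameters are made up for illustration, it runs on a virtual clock instead of talking to Elasticsearch, and it assumes that latency is measured from the scheduled start of a request (so it includes waiting time) while service time covers only the request itself.

```python
def simulate_throttled_client(target_throughput, service_times):
    """Simulate a blocking, throughput-throttled client on a virtual clock.

    target_throughput: desired requests per second (hypothetical parameter).
    service_times: seconds each simulated request takes to be served.
    Returns one (service_time, latency) tuple per request.
    """
    interval = 1.0 / target_throughput  # time between scheduled requests
    now = 0.0                           # virtual clock in seconds
    results = []
    for i, service_time in enumerate(service_times):
        scheduled = i * interval
        # Previous response arrived early: wait until the scheduled start.
        # Previous response arrived late: start immediately (we are behind).
        start = max(now, scheduled)
        end = start + service_time
        # Assumption: latency counts from the scheduled start, so it
        # includes any time the request spent waiting behind earlier ones.
        results.append((service_time, end - scheduled))
        now = end
    return results


if __name__ == "__main__":
    # Target of 10 requests/s, i.e. one request every 0.1 s.
    for label, service_time in (("below capacity", 0.05), ("over capacity", 0.2)):
        print(label)
        for st, latency in simulate_throttled_client(10, [service_time] * 5):
            print(f"  service_time={st:.2f}s latency={latency:.2f}s")
```

In the first run every request finishes before the next one is due, so latency stays equal to the service time. In the second run each request takes longer than the interval allows, the client falls further behind its schedule, and latency keeps growing while the service time stays constant.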
Daniel