Could not clone from 'https://github.com/elastic/rally-tracks'

Hi @xihuanbanku,

Every Java application should be properly warmed up before you start measuring. "Warmup" just means that you give the application some time right after startup and don't consider the samples that you took during that time. So it's just about how you treat the samples you take. Rally has implemented two flavors of warmup:

  • Based on an iteration count: With warmup-iterations in the track specification you tell Rally how many times to run an operation without counting the results in reporting. The operation is still executed; its samples are just labeled differently. We use this warmup mode for queries.
  • Based on a warmup time period: This tells Rally to wait for a specific time period until it switches from warmup mode to measurement mode (parameter warmup-time-period, in seconds). Admittedly, this is machine-dependent and mainly geared towards our nightly benchmarking system, but it's a pragmatic start. We use this warmup mode for bulk indexing. (Both parameters are shown in the track snippet after this list.)
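To make the two parameters concrete, here is a minimal sketch of how they might appear in a track's schedule. The operation names and values are made up for illustration:

```json
{
  "schedule": [
    {
      "operation": "index-append",
      "warmup-time-period": 120,
      "clients": 8
    },
    {
      "operation": "default-query",
      "warmup-iterations": 500,
      "iterations": 1000
    }
  ]
}
```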

You can see an example walkthrough in the documentation: Define Custom Workloads: Tracks - Rally 2.10.0.dev0 documentation

The core implementation is in driver.py, but I admit it's pretty hard to understand without explanation. So here's an attempt (I'll explain just the bulk indexing implementation):

First of all, Rally starts N client processes, where N is the number of clients you specify in track.json (8 if you didn't change it). I use processes because multithreading in Python is not well suited here (the global interpreter lock prevents threads from running Python code in parallel). In Java you can use 8 indexing threads (you seem to use only one thread, and I guess this is the main difference).
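In case the process-per-client idea is unclear, here is a hypothetical sketch (not Rally's actual code) of starting N client processes with Python's multiprocessing module:

```python
import multiprocessing


def run_client(client_id):
    # in Rally, each client process would read its share of the data and
    # issue bulk requests in a loop; this stub just illustrates the setup
    print(f"client {client_id} started")


if __name__ == "__main__":
    # one OS process per simulated client sidesteps the GIL, which would
    # otherwise serialize CPU-bound work across threads
    clients = [multiprocessing.Process(target=run_client, args=(i,)) for i in range(8)]
    for p in clients:
        p.start()
    for p in clients:
        p.join()
```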

Each client process reads bulks of 5000 documents from the file and issues one bulk request per bulk (so that is already a difference: you use a bulk size of 10,000 if I understood you correctly).
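As a simplified sketch of the chunking (again, not the actual Rally code), a client could group the source file's documents into bulks like this:

```python
def bulks(lines, bulk_size=5000):
    # group an iterable of documents (e.g. lines of a JSON file) into
    # chunks of `bulk_size`; the last chunk may be smaller
    bulk = []
    for line in lines:
        bulk.append(line)
        if len(bulk) == bulk_size:
            yield bulk
            bulk = []
    if bulk:
        yield bulk
```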

Bulk requests are sent via the Python Elasticsearch client, which uses the HTTP protocol (and I guess you use the transport client in Java, so this is another difference).
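Sending such a bulk with the Python client looks roughly like the following (assuming a pre-8.x elasticsearch-py where bulk() accepts a body string; the endpoint and index name are placeholders):

```python
from elasticsearch import Elasticsearch

# plain HTTP, unlike the Java transport client's binary protocol
es = Elasticsearch(["http://localhost:9200"])

# the bulk API expects newline-delimited JSON: an action line, then a source line
body = (
    '{"index": {"_index": "test"}}\n'
    '{"field": "value"}\n'
)
response = es.bulk(body=body)
```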

With regard to throughput measurement: This is a throughput benchmark, so I just take one timestamp when Rally issues the request and another one when the response arrives, and take the difference. The throughput per client is then bulk size / (tstop - tstart). You can find the benchmarking loop in execute_schedule().
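In code, the per-request measurement boils down to something like this sketch (timed_bulk is a hypothetical helper, not a Rally function):

```python
import time

BULK_SIZE = 5000


def timed_bulk(es, body):
    # take a timestamp before the request is issued and another one
    # after the response arrives; the difference is the service time
    # of this single bulk request
    start = time.perf_counter()
    es.bulk(body=body)
    stop = time.perf_counter()
    # per-client throughput for this request, in documents per second
    return BULK_SIZE / (stop - start)
```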

The tricky part is that you are interested in the throughput over all clients, not just per client. That means the individual throughput measurements have to be aggregated, which is error-prone (I fixed several bugs in this area). You can find the implementation of this aggregation in the function calculate_global_throughput(), which computes the throughput in documents per second.
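The real calculate_global_throughput() handles more edge cases, but the core idea can be sketched like this (a hypothetical simplification): bucket all clients' samples into one-second wall-clock windows and sum the documents completed per window:

```python
from collections import defaultdict


def global_throughput(samples):
    # samples: (absolute_timestamp_in_seconds, docs_in_bulk) tuples
    # gathered from all client processes
    buckets = defaultdict(int)
    for timestamp, docs in samples:
        buckets[int(timestamp)] += docs
    # documents per second for each one-second window, in chronological order
    return sorted(buckets.items())
```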

I hope that helps you understand better how Rally measures throughput.

We have some Java-based benchmarks, but they were intended for benchmarking the transport client against the REST client. You can see the benchmarking code in the Elasticsearch repository. I also wrote a blog post about the benchmark that you might be interested in.

Daniel