Is esrally cheating? Big difference between two measurements

I ran esrally twice (each run performs only “index-append”), but the results are very different.

1. Create the index geonames manually
2. First test:
esrally race --pipeline=benchmark-only --target-hosts= --track=geonames
The result:
|Min Throughput | index-append | 144447 | docs/s |
|Median Throughput | index-append | 209693 | docs/s |
|Max Throughput | index-append | 225118 | docs/s |

3. Delete the index geonames manually
4. Create the index geonames manually
5. Second test:
esrally race --pipeline=benchmark-only --target-hosts= --track=geonames
The result:
|Min Throughput | index-append | 228746 | docs/s |
|Median Throughput | index-append | 256881 | docs/s |
|Max Throughput | index-append | 263886 | docs/s |

The throughput of the two runs is very different: the median differs by about 50000 docs/s. I repeated the test many times, and the same result appears each time.
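To put a number on the gap, here is a small sketch that computes the absolute and relative difference for each metric from the two result tables above. The function name `throughput_gap` is only an illustration and not part of esrally.

```python
# Throughput figures (docs/s) copied from the two result tables in this post.
first_run = {"min": 144447, "median": 209693, "max": 225118}
second_run = {"min": 228746, "median": 256881, "max": 263886}


def throughput_gap(first, second):
    """Return {metric: (absolute_gap, relative_gap)} relative to the first run."""
    gaps = {}
    for name, baseline in first.items():
        absolute = second[name] - baseline
        gaps[name] = (absolute, absolute / baseline)
    return gaps


gaps = throughput_gap(first_run, second_run)
for name, (absolute, relative) in gaps.items():
    print(f"{name}: +{absolute} docs/s ({relative:.1%})")
```

For the median this is a gap of 47188 docs/s, roughly 22.5% above the first run.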

Hi @Liujinan,

no cheating going on :smiley:.

As described, the two tests look identical, unless you restart the node, recreate it, or do something else prior to the first step. Could you please explain what the true starting point for the experiment is?

Also, it is important that Elasticsearch is set up for production use.

Is it a single-node cluster or does it have multiple nodes? Explaining the cluster topology may help here, as well as including the precise "create index manually" step.


@Liujinan You may also find this blog post useful: it outlines a few potential issues that could lead to inconsistent results.


My Elasticsearch cluster is a single node--, and the esrally node is
Before step 1, I restart Elasticsearch. Once I can reach it with curl, I execute the next steps.

What I mean by "create index manually":
In its default configuration, esrally runs the following steps: "delete-index", "create-index", "cluster-health", "index-append", ......
I removed the "delete-index" and "create-index" steps.
I create the index manually on the Elasticsearch node, then execute the remaining steps.
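For anyone trying to reproduce this, the manual delete and create steps could be sketched roughly as follows. This is only an illustration: the host `http://localhost:9200` and the index settings in `example_body` are placeholders, not the values actually used in this thread, and the requests are built but not sent.

```python
import json
import urllib.request


def build_manual_index_requests(base_url, index, body):
    """Build (but do not send) the DELETE and PUT requests that recreate an index."""
    url = f"{base_url.rstrip('/')}/{index}"
    delete_req = urllib.request.Request(url, method="DELETE")
    create_req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    return [delete_req, create_req]


# Placeholder settings (assumption): replace with the body you really send.
example_body = {"settings": {"index": {"number_of_shards": 5, "number_of_replicas": 0}}}

requests_to_send = build_manual_index_requests(
    "http://localhost:9200", "geonames", example_body
)
for req in requests_to_send:
    print(req.get_method(), req.full_url)
    # To actually send a request: urllib.request.urlopen(req)
```

Posting the exact body passed to the PUT request makes it possible to compare the manual step with what the track would have created.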

Hi @Liujinan,

three more questions:

  1. When you say restart Elasticsearch, do you then restart just the JVM or is there any reboot of host/container or similar going on?
  2. The elasticsearch node and rally node, are they separate physical machines or are there any virtualization/containers involved?
  3. The create index manually step, does that use the exact same settings as rally? Can you share the precise step done here (the full request sent to elasticsearch)?
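For question 3, one way to check whether the manual index matches what Rally would create is to diff the two settings documents. The sketch below is a minimal example with made-up values: `manual_settings` stands in for what the index settings API returns for your index, and `track_settings` for the settings in the track's index definition. Neither reflects the real geonames track. Values are compared as strings, on the assumption that Elasticsearch reports numeric settings as strings.

```python
def flatten(settings, prefix=""):
    """Flatten nested settings into dotted keys, with values normalized to strings."""
    flat = {}
    for key, value in settings.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = str(value)
    return flat


def diff_settings(manual, track):
    """Return {key: (manual_value, track_value)} for every setting that differs."""
    a, b = flatten(manual), flatten(track)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Hypothetical example values, for illustration only.
manual_settings = {"index": {"number_of_shards": "5", "number_of_replicas": "1"}}
track_settings = {
    "index": {"number_of_shards": 5, "number_of_replicas": 0, "refresh_interval": "30s"}
}

differences = diff_settings(manual_settings, track_settings)
for key, (manual_value, track_value) in differences.items():
    print(f"{key}: manual={manual_value} track={track_value}")
```

An empty result would suggest the settings match; any listed key is a candidate explanation for differing throughput.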
