Hi there,
I have an already provisioned Elasticsearch setup with an index that consists of around 30 million documents. I would like to benchmark this existing setup using esrally, which provides the benchmark-only pipeline for this purpose. However, I would like to benchmark my setup with my own custom searches, using terms aggregations, significant_terms aggregations and so on. How can this be achieved? Is it possible to customize the benchmark-only pipeline, or do I have to set up a custom track? If a custom track is needed, the docs say that the corpora element is mandatory. But I don't need a custom corpus, because I do not want to index any documents.
Benchmark-only is a pipeline, not a track, so you will need to create a custom track that describes the operations and the workload. Unfortunately, I think the corpora element is still mandatory, even though I would expect running custom queries against existing data to be quite a common use case that does not need one. Rally originated as a tool for regression performance testing, which generally involved benchmarking (and often also setting up and provisioning) an empty cluster. For that use case it makes sense to make a data set mandatory.
Now that Rally is used for a wider range of use cases, I think it would make sense to relax these constraints and only require a corpora element if there are operations that rely on one. I would recommend creating an enhancement request.
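As a rough illustration of what such a query-only custom track might look like, here is a minimal sketch that writes a `track.json` containing only search operations. The helper `build_track`, the index name `my-index`, the field name `category`, the foreground query and the client and iteration counts are all placeholders I made up for the example. The sketch deliberately leaves out the `corpora` element, so depending on your Rally version the track may be rejected until a corpora entry is added, as described above.

```python
import json


def build_track(index_name, field, foreground_query=None):
    """Build a query-only track definition as a plain dict.

    Hypothetical helper: index_name, field and foreground_query are
    placeholders for your existing index, a keyword field in it, and
    the query that selects the foreground set for significant_terms.
    """
    if foreground_query is None:
        # Placeholder foreground query; replace with a real one.
        foreground_query = {"term": {field: "example-value"}}

    terms_search = {
        "name": "terms-agg",
        "operation-type": "search",
        "index": index_name,
        "body": {
            "size": 0,  # only the aggregation result is of interest
            "aggs": {"top_values": {"terms": {"field": field}}},
        },
    }
    significant_search = {
        "name": "significant-terms-agg",
        "operation-type": "search",
        "index": index_name,
        "body": {
            "size": 0,
            "query": foreground_query,
            "aggs": {
                "unusual_values": {"significant_terms": {"field": field}}
            },
        },
    }
    return {
        "version": 2,
        "description": "Custom searches against an existing index",
        # NOTE: no "corpora" element here. Per the reply above, Rally may
        # still require one, so this is an assumption to verify.
        "schedule": [
            {"operation": terms_search, "clients": 1,
             "warmup-iterations": 100, "iterations": 500},
            {"operation": significant_search, "clients": 1,
             "warmup-iterations": 100, "iterations": 500},
        ],
    }


if __name__ == "__main__":
    # Write the track file into the current directory.
    with open("track.json", "w") as f:
        json.dump(build_track("my-index", "category"), f, indent=2)
```

The directory containing the generated `track.json` could then be passed to Rally via `--track-path` together with `--pipeline=benchmark-only` and `--target-hosts`; please check the exact invocation against the docs for your Rally version.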