Best practices for testing against clusters on ECE

Hello,

We intend to test the ingest and query performance of clusters hosted on ECE using Rally. So far, the elastic/logs track looks interesting. I studied this video and PDF on Rally testing pitfalls by the Rally author Daniel Mitterdorfer. The last topic there stresses the importance of repeating each test many (>30) times, so that you can base sound conclusions on the results.

That triggers these questions:

  1. How do you get your to-be-tested cluster on ECE into the same initial state before each test run, so that you can make apples-to-apples comparisons? I guess you need to delete and re-create the cluster each time, and then optionally load a snapshot with the initial test data. Deleting and creating the cluster and loading the snapshot could all be done via the respective APIs, allowing automated repeated testing.

  2. How does ECE itself behave: will it still perform the same after a cluster has been deleted and created, say, 100 times? Or will discarded data build up? We use ECE version 3.3.0 (I realize this post covers both Rally and ECE functionality).

Best practices in general on using Rally against ECE-hosted clusters are very welcome too.

Thanks!
Jan Stap

Hi Jan,

Thank you for your post.

  1. How do you get your to-be-tested cluster on ECE into the same initial state before each test run, so that you can make apples-to-apples comparisons? I guess you need to delete and re-create the cluster each time, and then optionally load a snapshot with the initial test data.

We recommend benchmarking against a new cluster for each benchmark run; that is how the Elasticsearch nightly benchmarks are done.
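The delete-then-recreate cycle can be scripted against the ECE RESTful API. Below is a minimal sketch; the coordinator URL, API key, deployment id, and payload file are all hypothetical placeholders, and the shutdown-then-delete sequence is an assumption about the deployments API, so check it against the API reference for your ECE version before relying on it.

```shell
#!/bin/sh
# Placeholders -- replace with your own coordinator address and API key.
ECE_URL="https://ece.example.com:12443"
API_KEY="REPLACE_ME"

# Pure helper: the path of a specific deployment resource.
dep_path() {
  printf '/api/v1/deployments/%s' "$1"
}

# Authenticated call against the ECE RESTful API; extra args go to curl.
ece_api() {
  method="$1"; path="$2"; shift 2
  curl -sk -X "$method" \
    -H "Authorization: ApiKey $API_KEY" \
    -H "Content-Type: application/json" \
    "$@" "$ECE_URL$path"
}

# One benchmark iteration: tear down the old deployment, create a fresh
# one from a stored payload, then (outside this sketch) restore the
# snapshot and start the Rally race.
run_iteration() {
  dep_id="$1"
  ece_api POST   "$(dep_path "$dep_id")/_shutdown"
  ece_api DELETE "$(dep_path "$dep_id")"
  ece_api POST   "/api/v1/deployments" --data @create-deployment.json
}
```

Looping over `run_iteration` from a driver script gives you the 30+ repetitions against a fresh deployment each time.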

  2. How does ECE itself behave: will it still perform the same after a cluster has been deleted and created, say, 100 times? Or will discarded data build up? We use ECE version 3.3.0 (I realize this post covers both Rally and ECE functionality).

I would not expect ECE to behave differently after 5, 10, or 1000 clusters. When a deployment is deleted, so is its data. For this type of benchmarking we tend to use Elasticsearch node sizes large enough to consume an entire allocator, but that is not required.

The elastic/logs track is a great track for benchmarking. For indexing, you will want to tune bulk_size and bulk_indexing_clients so as not to overwhelm your deployments. And once you are satisfied with your indexing benchmarks, you can certainly create a new deployment, restore the snapshot, and then run logging-querying without the bulk-index and compression-stats tasks to exercise just the query workflows.
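A query-only run along those lines might look like this on the command line; the host, credentials, and track parameter values are illustrative placeholders, not recommendations:

```shell
# Run only the query workflows of the elastic/logs track against an
# existing deployment that already holds the restored snapshot data.
esrally race \
  --track=elastic/logs \
  --challenge=logging-querying \
  --exclude-tasks="bulk-index,compression-stats" \
  --pipeline=benchmark-only \
  --target-hosts=my-cluster.ece.example.com:9243 \
  --client-options="use_ssl:true,basic_auth_user:'elastic',basic_auth_password:'REPLACE_ME'"

# For indexing runs, bulk_size and bulk_indexing_clients can be set via
# track parameters, e.g. (values are illustrative only):
#   --track-params="bulk_size:5000,bulk_indexing_clients:8"
```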

In ECE, container cgroup CPU time is scheduled using the Completely Fair Scheduler (CFS). In case you haven't seen it, take a look at Manage your installation capacity | Elastic Cloud Enterprise Reference [3.6] | Elastic to see how the CPU quota is calculated.
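As a concrete illustration of the quota arithmetic (cgroup v1 paths; the numbers below are made up, not taken from a real allocator): the effective CPU allowance of a container is its CFS quota divided by the CFS period.

```shell
# Example with illustrative values -- on a live allocator you would read
# these from the container's cgroup, e.g.
#   /sys/fs/cgroup/cpu/<container>/cpu.cfs_quota_us
quota_us=200000     # hypothetical cpu.cfs_quota_us value
period_us=100000    # the default CFS period
echo "effective allowance: $((quota_us / period_us)) CPUs"
```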

Thank you,
Jason

Hi Jason,

Thanks for your detailed answer, and for pointing out the ECE CPU scheduler; I had seen it before, but I'm now better aware of it.

Best regards,
Jan

