Is there a recommended pipeline to benchmark an existing cluster?

We've built an Elasticsearch cluster and indexed our data. How do I benchmark this cluster?

Suppose we already have our tracks, we only want to evaluate search latency, and we know the throughput we'll need.

  1. Do I use the Rally daemon with a custom configuration to build another cluster from scratch? And if this is the best option, do I index all of the data in it, or only part of it?

  2. Or do I use the benchmark-only pipeline to evaluate the existing cluster?

Is there any measurement related to search latency that I can't get with the second option, but that is relevant and that I should consider when configuring my cluster?

If you want to benchmark an existing cluster, you should use the benchmark-only pipeline. Make sure, of course, that you don't run Rally on the same machine as any of the Elasticsearch nodes (while at it, also make sure you've familiarized yourself with the basic benchmarking gotchas eloquently depicted in the 7 deadly sins of benchmarking presentation).
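As a rough illustration of what a benchmark-only run looks like, here is a minimal Python sketch that assembles the Rally command line. The host addresses, track path, and the helper name `build_rally_command` are placeholders made up for this example, and the `race` subcommand assumes a reasonably recent Rally version (older releases were invoked without it).

```python
import shlex


def build_rally_command(target_hosts, track_path, report_file=None):
    """Assemble an esrally invocation that benchmarks an already running cluster."""
    cmd = [
        "esrally",
        "race",
        # benchmark-only: Rally does not provision Elasticsearch, it only drives load
        "--pipeline=benchmark-only",
        "--target-hosts=" + ",".join(target_hosts),
        # path to a custom track directory (placeholder value in the example below)
        "--track-path=" + track_path,
    ]
    if report_file is not None:
        # optionally write the summary report to a file as well
        cmd.append("--report-file=" + report_file)
    return cmd


# Placeholder hosts and path; run this from a machine that hosts no Elasticsearch node.
cmd = build_rally_command(["10.0.0.1:9200", "10.0.0.2:9200"], "/home/bench/my-track")
print(shlex.join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` on the load driver machine once the values match your environment.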

If there are issues with scalability of the load driver (highly doubtful, since you want to evaluate search latency rather than indexing throughput) you can consider distributing the load driver across several machines.

Thank you for your response. I've watched the video Benchmarking Elasticsearch with Rally, but I didn't know about this presentation. We are just starting our benchmarking process, and in the future we'll certainly need to deal with indexing too, so thanks for the tips.

Suppose that we run our track and don't get the desired latency. How should we modify the cluster based on our results? From what I understood from reading the docs, with benchmark-only we don't get measurements from the telemetry devices.
In the benchmark-only scenario, will we have to guess which configuration to tweak? For example, make a prioritized list of changes and rerun the benchmark after each one?

For instance:

  1. First increase the heap, and then run the track again
  2. Work on number of shards or size of each one, and then run the track again

Something like this?
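To make the "change one thing, rerun, compare" loop concrete, here is a small Python sketch that ranks runs by a latency percentile. The run labels and latency values are placeholders, not measurements, and the nearest-rank percentile is only one of several common definitions.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of latencies."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def compare_runs(runs, pct=99):
    """Return (label, latency) pairs sorted from lowest to highest latency at pct."""
    summary = [(label, percentile(latencies, pct)) for label, latencies in runs.items()]
    return sorted(summary, key=lambda item: item[1])


# Placeholder latencies in milliseconds, one list per benchmark run.
runs = {
    "baseline": [12, 15, 14, 40, 13],
    "bigger-heap": [12, 14, 13, 38, 13],
    "smaller-shards": [9, 10, 11, 22, 10],
}
for label, latency in compare_runs(runs, pct=99):
    print(f"{label}: p99 = {latency} ms")
```

Changing a single setting per run keeps each comparison attributable to that setting.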

If you have a search use case, it might be worthwhile looking at the methodology described in this old ElasticON talk. As queries are executed single-threaded against each shard, the first step is often to see how search latency depends on shard size. Make sure you vary the queries so that the results are not all served from cache. Once you find a good shard size (or range), you set up a cluster and see how many nodes you need in order to handle all your data (generally with one replica). You can now see how much query throughput this small cluster can handle while responding within SLA. Once you have established this, you can scale out nodes and replicas to handle larger throughput.
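The arithmetic behind those steps can be sketched in a few lines of Python. The function name `plan_capacity` and all input numbers are placeholders for illustration, and the scale-out factor assumes that throughput grows roughly linearly with added nodes and replicas, which you would need to confirm by benchmarking.

```python
import math


def plan_capacity(total_data_gb, shard_size_gb, measured_qps, required_qps):
    """Estimate the primary shard count and a scale-out factor.

    shard_size_gb: shard size that met the latency target in single-shard tests.
    measured_qps:  throughput the small cluster sustained while within SLA.
    Assumes throughput scales about linearly when nodes and replicas are added.
    """
    primaries = max(1, math.ceil(total_data_gb / shard_size_gb))
    scale_factor = max(1, math.ceil(required_qps / measured_qps))
    return primaries, scale_factor


# Placeholder inputs: 500 GB of data, 25 GB shards, 40 qps measured, 150 qps needed.
primaries, scale_factor = plan_capacity(500, 25, 40, 150)
print(f"primary shards: {primaries}, scale-out factor: {scale_factor}")
```

Treat the result as a starting point for the next benchmark run rather than a final sizing.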

Thank you, I was looking for something like this, in case my existing cluster can't give me the latency I need.