Benchmark an Elastic cluster with Rally

Hello,

I am going to set up a production cluster. I receive 30 GB of logs per day and need to keep the data for 1 month on hot nodes and 11 months on warm nodes.

I did some maths, and I am going to use 3 hot nodes and 4 warm nodes (64 GB of memory in each node).
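As a rough cross-check of this kind of sizing, the retention figures can be turned into storage and ingest numbers. This is only a sketch: the 30 GB/day, 1 month hot and 11 months warm figures come from the post, while the 30-day month, the 1:1 ratio of raw to indexed size, the single replica, and the 500-byte average event size are hypothetical placeholders to be replaced with measured values.

```python
# Rough storage and ingest-rate estimate for a hot/warm retention policy.
DAILY_GB = 30            # from the post
HOT_DAYS = 30            # assumption: 1 month = 30 days
WARM_DAYS = 11 * 30      # assumption: 11 months of 30 days each
INDEX_RATIO = 1.0        # hypothetical: indexed size / raw size
REPLICAS = 1             # hypothetical: one replica per primary shard
AVG_EVENT_BYTES = 500    # hypothetical: average size of one log event


def tier_storage_gb(daily_gb, days, index_ratio=INDEX_RATIO, replicas=REPLICAS):
    """Disk needed to hold `days` of data, including replica copies."""
    return daily_gb * days * index_ratio * (1 + replicas)


def average_eps(daily_gb, avg_event_bytes=AVG_EVENT_BYTES):
    """Average events per second implied by a daily volume (1 GB = 10**9 bytes)."""
    return daily_gb * 10**9 / avg_event_bytes / 86_400


hot_gb = tier_storage_gb(DAILY_GB, HOT_DAYS)
warm_gb = tier_storage_gb(DAILY_GB, WARM_DAYS)
eps = average_eps(DAILY_GB)

print(f"hot tier:  {hot_gb:.0f} GB total, {hot_gb / 3:.0f} GB per node (3 nodes)")
print(f"warm tier: {warm_gb:.0f} GB total, {warm_gb / 4:.0f} GB per node (4 nodes)")
print(f"average ingest rate: about {eps:.0f} events/s")
```

The average rate is only a floor: a benchmark target would normally sit well above it to leave room for traffic peaks.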

I have never used Rally before, so I would like to ask some questions:
1- How can I benchmark this cluster to see how many eps (events per second) I can get with this configuration?
2- Should I create the cluster first to be able to benchmark it, or can Rally simulate it?

Thanks in advance,
Best regards

Hi @Abdelhalim,

Have you had a look through the Rally Quickstart docs? They run through installing Rally and running a basic benchmark.

We also have some documentation related to..

2- Should I create the cluster first to be able to benchmark it, or can Rally simulate it?

Rally can do either - but in any case you'll first need to have provisioned the underlying infrastructure (i.e. the machines themselves).

See Setting up a Cluster on how to use Rally to provision Elasticsearch for you.

This should be enough to get you started!

Thanks,
Brad

Thanks @Bradley_Deam, the documentation helped me understand how Rally works.
I am using the command below to run the benchmark:

esrally race --track=http_logs --target-hosts=10.13.81.11:9200,10.13.81.12:9200,10.13.81.13:9200 --pipeline=benchmark-only --client-options="use_ssl:true,verify_certs:true,ca_certs:'ca.crt',client_cert:'beat.crt',client_key:'beat.key',basic_auth_user:'elastic',basic_auth_password:'My_Passwd'" --track-params="number_of_shards:1,number_of_replicas:0"
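The quoting in a command like this is easy to get wrong, because string values inside --client-options carry their own single quotes (the style used in the Rally documentation examples). One way to sidestep shell quoting is to assemble the arguments in Python and pass them as a list. A minimal sketch, reusing the hosts and file names from the command above; the helper rally_kv is hypothetical, the option spelling verify_certs follows the Rally docs, and the actual call to esrally is left commented out.

```python
import shlex


def rally_kv(options):
    """Render a dict in comma-separated key:value form.

    Strings are wrapped in single quotes; booleans and numbers are not.
    """
    parts = []
    for key, value in options.items():
        if isinstance(value, bool):
            rendered = "true" if value else "false"
        elif isinstance(value, (int, float)):
            rendered = str(value)
        else:
            rendered = f"'{value}'"
        parts.append(f"{key}:{rendered}")
    return ",".join(parts)


hosts = ["10.13.81.11:9200", "10.13.81.12:9200", "10.13.81.13:9200"]
client_options = {
    "use_ssl": True,
    "verify_certs": True,
    "ca_certs": "ca.crt",
    "client_cert": "beat.crt",
    "client_key": "beat.key",
    "basic_auth_user": "elastic",
    "basic_auth_password": "My_Passwd",
}
track_params = {"number_of_shards": 1, "number_of_replicas": 0}

cmd = [
    "esrally", "race",
    "--track=http_logs",
    "--target-hosts=" + ",".join(hosts),
    "--pipeline=benchmark-only",
    "--client-options=" + rally_kv(client_options),
    "--track-params=" + rally_kv(track_params),
]

# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list means no shell is involved, so the inner single quotes reach Rally unchanged.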

I have just one final question:

I would like to benchmark indexing and search performance at the same time, with the search load coming from, for example, 5 users at a target throughput of 1000 ops/s.

Could you please tell me which options to add to my command ?

Best regards

Sorry for the delay here, the notification of your reply got lost in my inbox :frowning:

I would like to benchmark indexing and search performance at the same time, with the search load coming from, for example, 5 users at a target throughput of 1000 ops/s.

Could you please tell me which options to add to my command ?

Sure - this sounds like a great fit for a parallel block - see the docs here. The docs also contain an example of concurrent indexing and querying:

In this scenario, we run indexing and a few queries in parallel with a total of 14 clients:



"schedule": [
  {
    "parallel": {
      "tasks": [
        {
          "operation": "bulk",
          "warmup-time-period": 120,
          "time-period": 3600,
          "clients": 8,
          "target-throughput": 50
        },
        {
          "operation": "default",
          "clients": 2,
          "warmup-iterations": 50,
          "iterations": 100,
          "target-throughput": 50
        },
        {
          "operation": "term",
          "clients": 2,
          "warmup-iterations": 50,
          "iterations": 100,
          "target-throughput": 200
        },
        {
          "operation": "phrase",
          "clients": 2,
          "warmup-iterations": 50,
          "iterations": 100,
          "target-throughput": 200
        }
      ]
    }
  }
]

Note that the target-throughput specifies the number of requests per second over all clients, so a target-throughput: 50 with clients: 2 means each client will aim to perform ~25 op/s.
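Applied to the question above (5 search clients with a target of 1000 ops/s, running alongside indexing), that works out to about 200 ops/s per client. The sketch below builds such a parallel block as a Python dict and prints it as JSON. The operation names bulk and term are borrowed from the docs example and may not match the operations defined in the track you use; the bulk client count and the time periods are likewise copied from the example, not recommendations.

```python
import json


def per_client_rate(task):
    """target-throughput is shared by all clients of a task."""
    return task["target-throughput"] / task["clients"]


# Search load from the question: 5 clients, 1000 ops/s in total.
search_task = {
    "operation": "term",          # assumption: name taken from the docs example
    "clients": 5,
    "warmup-time-period": 120,
    "time-period": 3600,
    "target-throughput": 1000,
}

# Indexing load; with no target-throughput set, clients are not throttled.
bulk_task = {
    "operation": "bulk",          # assumption: name taken from the docs example
    "clients": 8,
    "warmup-time-period": 120,
    "time-period": 3600,
}

schedule = [{"parallel": {"tasks": [bulk_task, search_task]}}]
total_clients = sum(task["clients"] for task in schedule[0]["parallel"]["tasks"])

print(json.dumps({"schedule": schedule}, indent=2))
print(f"search: {per_client_rate(search_task):.0f} ops/s per client")
print(f"total clients in the parallel block: {total_clients}")
```

The printed schedule is meant to be pasted into a challenge of a custom track rather than passed on the command line.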