Hello,
I am going to set up a production cluster. I receive 30 GB of logs per day and need to keep the data for 1 month on hot nodes and 11 months on warm nodes.
I did some maths, and I am going to use 3 hot nodes and 4 warm nodes (64 GB of memory in each node).
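As a rough sketch of that sizing arithmetic, here is the raw volume each tier would hold. It assumes 30-day months, no replica shards, and an on-disk size equal to the raw log size; none of these are stated in the post, so treat the numbers as a starting point only.

```python
# Rough raw-volume estimate for the hot and warm tiers.
# Assumptions (not from the post): 30-day months, no replicas,
# and on-disk size equal to raw log size.
DAILY_GB = 30       # from the post
HOT_DAYS = 1 * 30   # 1 month on hot nodes
WARM_DAYS = 11 * 30 # 11 months on warm nodes
HOT_NODES = 3
WARM_NODES = 4


def tier_volume_gb(daily_gb, days, replicas=0):
    """Total data held by a tier, counting primaries plus replica copies."""
    return daily_gb * days * (1 + replicas)


hot_gb = tier_volume_gb(DAILY_GB, HOT_DAYS)
warm_gb = tier_volume_gb(DAILY_GB, WARM_DAYS)

print(f"hot tier:  {hot_gb} GB total, {hot_gb / HOT_NODES:.0f} GB per node")
print(f"warm tier: {warm_gb} GB total, {warm_gb / WARM_NODES:.0f} GB per node")
```

Passing `replicas=1` to `tier_volume_gb` doubles both figures, which is worth checking against the available disk per node.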
I have never used Rally before, so I would like to ask some questions:
1- How can I benchmark this cluster to see how many events per second (EPS) I can get with this configuration?
2- Should I create the cluster first to be able to benchmark it, or can Rally simulate it?
Thanks in advance,
Best regards
Hi @Abdelhalim,
Have you had a look through the Rally Quickstart docs? They run through installing Rally and running a basic benchmark.
We also have some documentation related to..
2- Should I create the cluster first to be able to benchmark it, or can Rally simulate it?
Rally can do either - but in any case you'll first need to have provisioned the underlying infrastructure (i.e. the machines themselves).
See Setting up a Cluster on how to use Rally to provision Elasticsearch for you.
This should be enough to get you started!
Thanks,
Brad
Thanks @Bradley_Deam, the documentation helped me understand how Rally works.
I am using the command below to run the benchmark:
esrally race --track=http_logs --target-hosts=10.13.81.11:9200,10.13.81.12:9200,10.13.81.13:9200 --pipeline=benchmark-only --client-options="use_ssl:true,verify_certs:true,ca_certs:'ca.crt',client_cert:'beat.crt',client_key:'beat.key',basic_auth_user:'elastic',basic_auth_password:'My_Passwd'" --track-params="number_of_shards:1,number_of_replicas:0"
I have just one final question:
I would like to benchmark indexing and search performance at the same time, for example with 5 search users and a target throughput of 1000 ops/s.
Could you please tell me which options to add to my command ?
Best regards
Sorry for the delay here, the notification of your reply got lost in my inbox.
I would like to benchmark indexing and search performance at the same time, for example with 5 search users and a target throughput of 1000 ops/s.
Could you please tell me which options to add to my command ?
Sure - this sounds like a great fit for a parallel block - see the docs here. The docs also contain an example of concurrent indexing and querying:
In this scenario, we run indexing and a few queries in parallel with a total of 14 clients:
"schedule": [
  {
    "parallel": {
      "tasks": [
        {
          "operation": "bulk",
          "warmup-time-period": 120,
          "time-period": 3600,
          "clients": 8,
          "target-throughput": 50
        },
        {
          "operation": "default",
          "clients": 2,
          "warmup-iterations": 50,
          "iterations": 100,
          "target-throughput": 50
        },
        {
          "operation": "term",
          "clients": 2,
          "warmup-iterations": 50,
          "iterations": 100,
          "target-throughput": 200
        },
        {
          "operation": "phrase",
          "clients": 2,
          "warmup-iterations": 50,
          "iterations": 100,
          "target-throughput": 200
        }
      ]
    }
  }
]
Note that the target-throughput specifies the number of requests per second over all clients, so a target-throughput: 50 with clients: 2 means each client will aim to perform ~25 op/s.
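Applied to the scenario asked about (5 search clients with an overall target of 1000 ops/s, running alongside indexing), a parallel block could look roughly like the one generated below. This is only a sketch: the operation names `bulk` and `term` are borrowed from the docs example and are assumptions, so they must be replaced with operations that actually exist in the track being run. The indexing task's settings are copied from the example, and giving the search task the same time-based limits (so that it runs for as long as indexing does) is also an assumption.

```python
import json


def per_client_rate(task):
    """target-throughput is the total across all clients of a task."""
    return task["target-throughput"] / task["clients"]


indexing = {
    "operation": "bulk",  # assumed operation name, copied from the docs example
    "warmup-time-period": 120,
    "time-period": 3600,
    "clients": 8,
    "target-throughput": 50,
}

search = {
    "operation": "term",  # assumed operation name, copied from the docs example
    "warmup-time-period": 120,  # assumption: same time limits as indexing
    "time-period": 3600,
    "clients": 5,  # "5 users" from the question
    "target-throughput": 1000,  # total ops/s across the 5 clients
}

# Both tasks sit in one parallel block, so they run concurrently.
schedule = [{"parallel": {"tasks": [indexing, search]}}]

print(json.dumps({"schedule": schedule}, indent=2))
for task in (indexing, search):
    print(task["operation"], per_client_rate(task), "ops/s per client")
```

With these numbers each search client aims for 200 ops/s. The generated schedule would need to live in a custom track (or a modified copy of an existing one), since it changes the track definition rather than a command-line option.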