I am going to set up a production cluster. I receive 30 GB of logs per day and need to keep the data for 1 month on hot nodes and 11 months on warm nodes.
I did some maths, and I am going to use 3 hot nodes and 4 warm nodes (64 GB of memory in each node).
I have never used Rally before, which is why I would like to ask some questions:
1- How can I benchmark this cluster to see how many events per second (EPS) I can get with this configuration?
2- Should I create the cluster first to be able to benchmark it, or can Rally simulate it?
I would like to benchmark indexing and search performance at the same time, for example with 5 users searching at a target throughput of 1000 ops/s.
Could you please tell me which options to add to my command?
Sorry for the delay here; the notification of your reply got lost in my inbox.
I would like to benchmark indexing and search performance at the same time, for example with 5 users searching at a target throughput of 1000 ops/s.
Could you please tell me which options to add to my command?
Sure, this sounds like a great fit for a parallel block in the track's schedule; see the Rally track reference docs. The docs also contain an example of concurrent indexing and querying.
In that example scenario, indexing and a few queries run in parallel with a total of 14 clients.
Note that the target-throughput specifies the number of requests per second over all clients, so a target-throughput: 50 with clients: 2 means each client will aim to perform ~25 op/s.
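As a rough illustration of how this could look for the case in the question, here is a minimal Python sketch that builds a schedule with a parallel block and prints it as JSON. The operation names "bulk-index" and "search" are hypothetical placeholders for operations you would define in your own track, and the client count for indexing and the warmup/time periods are assumed values, not recommendations. The helper per_client_rate only restates the arithmetic from the note above.

```python
import json


def per_client_rate(target_throughput, clients):
    """Target throughput is shared across all clients of a task."""
    return target_throughput / clients


def build_schedule(index_clients, search_clients, search_target):
    """Build a Rally-style schedule that runs indexing and search in parallel.

    The operation names are hypothetical; they must match operations
    defined elsewhere in your own track.
    """
    return {
        "schedule": [
            {
                "parallel": {
                    "tasks": [
                        {
                            # No target-throughput: index as fast as possible.
                            "operation": "bulk-index",  # hypothetical name
                            "warmup-time-period": 120,  # assumed, in seconds
                            "time-period": 3600,  # assumed, in seconds
                            "clients": index_clients,
                        },
                        {
                            "operation": "search",  # hypothetical name
                            "warmup-time-period": 120,  # assumed, in seconds
                            "time-period": 3600,  # assumed, in seconds
                            "clients": search_clients,
                            # Requests per second over all search clients.
                            "target-throughput": search_target,
                        },
                    ]
                }
            }
        ]
    }


schedule = build_schedule(index_clients=8, search_clients=5, search_target=1000)
print(json.dumps(schedule, indent=2))

# 1000 ops/s over 5 clients: each client aims for 200 ops/s.
print(per_client_rate(1000, 5))
# The example from the note above: 50 ops/s over 2 clients is 25 ops/s each.
print(per_client_rate(50, 2))
```

With 5 search clients and a target throughput of 1000, each client would aim for about 200 ops/s. Whether the cluster can actually sustain that is exactly what the benchmark would tell you.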