How to Calculate Throughput(Ops/sec) from ESRally Report.md

Balmukund · June 3, 2019, 9:57am

Hi All,
I am running 232 queries per minute through ESRally, using 232 clients.

Below is an excerpt from my report.md file.

All,Min Throughput,custom_simple_query_228,0.13,ops/s
All,Median Throughput,custom_simple_query_228,0.17,ops/s
All,Max Throughput,custom_simple_query_228,0.17,ops/s
All,50th percentile latency,custom_simple_query_228,254735.82665342838,ms
All,90th percentile latency,custom_simple_query_228,453923.8566603046,ms
All,99th percentile latency,custom_simple_query_228,498532.68128809053,ms
All,100th percentile latency,custom_simple_query_228,502637.6438322477,ms
All,50th percentile service time,custom_simple_query_228,5885.677504120395,ms
All,90th percentile service time,custom_simple_query_228,6185.520075401291,ms
All,99th percentile service time,custom_simple_query_228,6512.958331475965,ms
All,100th percentile service time,custom_simple_query_228,7637.462149839848,ms
All,error rate,custom_simple_query_228,0.00,%
All,Min Throughput,custom_simple_query_229,0.13,ops/s
All,Median Throughput,custom_simple_query_229,0.16,ops/s
All,Max Throughput,custom_simple_query_229,0.16,ops/s
All,50th percentile latency,custom_simple_query_229,253916.0132848192,ms
All,90th percentile latency,custom_simple_query_229,453343.4856909327,ms
All,100th percentile latency,custom_simple_query_229,503350.4834943451,ms
All,50th percentile service time,custom_simple_query_229,6121.4104143437,ms
All,90th percentile service time,custom_simple_query_229,6363.997411495075,ms
All,100th percentile service time,custom_simple_query_229,7758.432420901954,ms
All,error rate,custom_simple_query_229,0.00,%
All,Min Throughput,custom_simple_query_230,0.11,ops/s
All,Median Throughput,custom_simple_query_230,0.13,ops/s
All,Max Throughput,custom_simple_query_230,0.14,ops/s
All,50th percentile latency,custom_simple_query_230,263817.0460090041,ms
All,90th percentile latency,custom_simple_query_230,469282.91428880766,ms
All,100th percentile latency,custom_simple_query_230,520855.9247748926,ms
All,50th percentile service time,custom_simple_query_230,7394.637248013169,ms
All,90th percentile service time,custom_simple_query_230,7630.584071855992,ms
All,100th percentile service time,custom_simple_query_230,8723.47238380462,ms
All,error rate,custom_simple_query_230,0.00,%
All,Min Throughput,custom_simple_query_231,0.13,ops/s
All,Median Throughput,custom_simple_query_231,0.17,ops/s
All,Max Throughput,custom_simple_query_231,0.18,ops/s
All,50th percentile latency,custom_simple_query_231,250339.09847494215,ms
All,90th percentile latency,custom_simple_query_231,446924.7991587036,ms
All,99th percentile latency,custom_simple_query_231,491856.99790116394,ms
All,100th percentile latency,custom_simple_query_231,496650.9179570712,ms
All,50th percentile service time,custom_simple_query_231,5686.616131104529,ms
All,90th percentile service time,custom_simple_query_231,6000.503335334361,ms
All,99th percentile service time,custom_simple_query_231,6214.789305739104,ms
All,100th percentile service time,custom_simple_query_231,7654.282569885254,ms
All,error rate,custom_simple_query_231,0.00,%
All,Min Throughput,custom_simple_query_232,0.12,ops/s
All,Median Throughput,custom_simple_query_232,0.14,ops/s
All,Max Throughput,custom_simple_query_232,0.14,ops/s
All,50th percentile latency,custom_simple_query_232,263251.6680445988,ms
All,90th percentile latency,custom_simple_query_232,468660.6013576966,ms
All,100th percentile latency,custom_simple_query_232,518969.1438791342,ms
All,50th percentile service time,custom_simple_query_232,6980.2779571618885,ms
All,90th percentile service time,custom_simple_query_232,7212.950173066929,ms
All,100th percentile service time,custom_simple_query_232,8547.225694172084,ms
All,error rate,custom_simple_query_232,0.00,%

So, how do I calculate the overall throughput (ops/s) across all search queries?

--Regards,
Balmukund

danielmitterdorfer · June 5, 2019, 7:27am

Hi,

If I understand you correctly, you run multiple queries (in parallel?) and want to calculate a single throughput metric for all queries together (e.g. throughput across all queries: X ops/s). While it would be (theoretically) possible to calculate this (see Rally's source code for how it's done on a per-task basis), I'd be interested to hear how you'd use that metric.

Daniel

Balmukund · June 6, 2019, 9:31am

Hi Daniel,
Thank you very much for your quick response. Yes, you are right, I'm running multiple queries in parallel.
Suppose:
Query q1 has a throughput of p1 ops/s,
Query q2 has a throughput of p2 ops/s,
Query q3 has a throughput of p3 ops/s.

Currently, I am calculating the total ops/s by simply adding them up: p1 + p2 + p3.
I just wanted your confirmation on whether this is correct.

--Regards,
Balmukund

danielmitterdorfer · June 7, 2019, 5:23am

Hi Balmukund,

the number you're calculating is based on summary statistics. If you add up e.g. the maximum throughput of all queries it is very likely that you overestimate what the system can achieve because that maximum for each individual query could be reached at different points in time during the benchmark. I also noticed that the difference between service time and latency is very high in some cases and this indicates that your benchmark is not in a stable state, i.e. your target throughput is too high (see our FAQ and the workload section of our blog post Seven Tips for Better Elasticsearch Benchmarks for details).
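To illustrate the point about summing maxima, here is a minimal Python sketch. The throughput samples are made up for illustration and are not taken from the report above; the only assumption is that the two queries reach their peak throughput in different time intervals.

```python
# Hypothetical per-interval throughput samples (ops/s) for two queries,
# measured over the same four time intervals.
q1 = [0.10, 0.18, 0.12, 0.11]
q2 = [0.17, 0.10, 0.11, 0.12]

# Adding the per-query maxima implicitly assumes both peaks coincide in time.
sum_of_maxima = max(q1) + max(q2)

# Adding the samples interval by interval first shows what the system
# actually achieved at any single point in time.
combined = [a + b for a, b in zip(q1, q2)]
max_of_combined = max(combined)

print(f"sum of maxima:   {sum_of_maxima:.2f} ops/s")
print(f"max of combined: {max_of_combined:.2f} ops/s")
```

With these numbers the sum of the maxima is 0.35 ops/s, while the system never delivered more than 0.28 ops/s in any one interval.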

In any case, to get an accurate picture I think you'd need to calculate the achieved throughput based on the raw samples. I don't know what these queries represent but I imagine if they are issued by your application to process a single customer request then it would be better to write a custom runner that executes all the operations that your application executes in the same order. This would be more realistic and also Rally would automatically show the correct metrics already for the high-level operation you're interested in. An alternative to this approach would be to benchmark your application directly (e.g. with JMeter or other load testing tools) as this would provide you with end-to-end metrics.
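As a rough sketch of what "based on the raw samples" could look like, the Python snippet below counts completed operations in a time window. It is not Rally's own calculation: the sample layout (a query name plus a completion time in seconds) and the function name `throughput` are assumptions made for illustration.

```python
from collections import Counter


def throughput(samples, start, end):
    """Count operations completed in [start, end) and divide by the window length.

    `samples` is a list of (query_name, completion_time_in_seconds) tuples;
    this layout is hypothetical and not Rally's record format.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    window = end - start
    completed = [name for name, t in samples if start <= t < end]
    overall = len(completed) / window
    per_query = {name: n / window for name, n in Counter(completed).items()}
    return overall, per_query


# Made-up raw samples: six operations, five of them finished within 10 seconds.
samples = [("q1", 1.0), ("q2", 2.5), ("q1", 7.0),
           ("q3", 8.0), ("q2", 9.5), ("q1", 12.0)]

overall, per_query = throughput(samples, 0.0, 10.0)
print(f"overall: {overall:.2f} ops/s")
for name, value in sorted(per_query.items()):
    print(f"{name}: {value:.2f} ops/s")
```

Because every operation is counted against the same window, the overall figure here is consistent with the per-query figures, which is not guaranteed when adding up summary statistics.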

Daniel

Balmukund · June 7, 2019, 10:16am

Hi Daniel,
Thank you very much for your response. You are right: if I add up the maximum throughput of all queries, it is very likely that I overestimate what the system can achieve. Hence, I am using the "All,Median Throughput" values instead.
Also, I am using Rally's 1 Billions track to test the system's CPU, memory and disk I/O utilization.
Below is my simple query:
"query": {
  "match": {
    "nginx.access.geoip.city_name": "Frankfort"
  }
}

Also, I am calculating the average response time by summing the "All,99th percentile service time" values of all queries and dividing by the number of queries.
I.e., if query 1 has an "All,99th percentile service time" of t1
and query 2 has an "All,99th percentile service time" of t2,
then average response time = (t1 + t2) / 2.

Please let me know if my calculation is wrong.

--Regards,
Balmukund

danielmitterdorfer · June 11, 2019, 5:45am

Hi,

I chose the maximum as an example where it is quite clear that you might overestimate the system's true capabilities but a similar reasoning applies to all summary statistics. I've provided an alternative to that in my earlier answer.

I guess by "average response time" you mean "average 99th percentile service time"? Unfortunately, percentiles (as well as minimum, median and maximum) only make sense in the context of the measurements for which they have been taken and you cannot aggregate them. You have two ways out of this: calculate your summary statistics based on the raw samples that you'll find in the index rally-metrics-* in your Elasticsearch metrics store, or use the mean value in rally-results-* that is available since Rally 1.1.0. However, I don't think that you should summarize results of two different queries. Let me provide an analogy: say it takes a truck 10 hours to drive from A to B and it takes a sports car 4 hours to cover the same distance. Averaging the times of the sports car and the truck gives (10 + 4)/2 = 7 hours. I am not sure how calculating this number helps you?
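A small Python sketch with made-up service times shows why percentiles cannot be aggregated. The data and the nearest-rank `percentile` helper are assumptions for illustration; the sketch only demonstrates the arithmetic and does not imply that pooling two different queries is meaningful.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical service times in ms for two queries with different sample counts.
fast = [5] * 90 + [50] * 10   # 100 samples
slow = [1000] * 20            # 20 samples

# Averaging the two per-query 99th percentiles ...
averaged = (percentile(fast, 99) + percentile(slow, 99)) / 2

# ... is not the same as the 99th percentile of all raw samples together.
pooled = percentile(fast + slow, 99)

print(f"average of p99 values: {averaged} ms")
print(f"p99 of raw samples:    {pooled} ms")
```

Here the averaged value is 525.0 ms, a service time that no request in either data set actually experienced, while the 99th percentile of the raw samples is 1000 ms.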

A great video is How NOT to measure latency, which goes into a lot of detail about what can go wrong when measuring latency and how to do it right.

Daniel

Balmukund · June 14, 2019, 8:49am

Hi Daniel,
Sorry for the delayed response, and thank you very much for your answer. I understand now that averaging the 99th percentiles does not make sense. However, I'm unable to find the rally-metrics-* or rally-results-* files.
It would be great if you could tell me the path to these files.

--Regards,
Balmukund

danielmitterdorfer · June 17, 2019, 1:49pm

Hi,

these are not files but index patterns that match indices that are created by Rally when you set up a dedicated Elasticsearch metrics store. See also our documentation about metrics records.
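As a hedged sketch, a search body for pulling raw service time samples out of rally-metrics-* might be built like this in Python. The field names (`name`, `task`, `sample-type`, `value`, `unit`) are assumptions based on the metrics records documentation and should be checked against your Rally version; the helper `service_time_query` is hypothetical, and the sketch only builds and prints the body without contacting a cluster.

```python
import json


def service_time_query(task, size=1000):
    """Build a search body for raw service time samples of one task.

    Field names are assumed from Rally's metrics records documentation;
    verify them against the mapping of your rally-metrics-* indices.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"name": "service_time"}},
                    {"term": {"task": task}},
                    # Exclude warmup samples (assumed field and value).
                    {"term": {"sample-type": "normal"}},
                ]
            }
        },
        "_source": ["value", "unit", "task"],
    }


body = service_time_query("custom_simple_query_228")
print(json.dumps(body, indent=2))
```

The printed body could then be sent to the `_search` endpoint of the rally-metrics-* indices in the metrics store, and the returned `value` fields used as raw samples for your own statistics.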

Daniel

Balmukund · July 10, 2019, 5:08am

Hi Daniel,
Sorry for the delayed response. Thank you for your input.

--Regards,
Balmukund

system · August 7, 2019, 5:09am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
