The principle of sorting queries

I simulated some discrete queries using esrally and found something that confuses me. The first screenshot is a run without sort and the second is a run with sort; the search QPS is the same.
The 99.9th percentile service time without sort is smaller than with sort, but the 100th percentile service time without sort is bigger than with sort. Why?

Thank you for your interest in Rally!

The 100th percentile is the slowest operation of the whole run. Many things can affect it, such as CPU utilization on the Elasticsearch cluster and on the load driver, the Java garbage collector, the Python garbage collector, disk accesses, etc. I'd suggest running those benchmarks multiple times; that will give you an idea of how reliable those numbers are.
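To see why the 100th percentile moves around so much more than the 99.9th: it is literally the single slowest sample of the run, so one stall is enough to shift it, while the 99.9th percentile is backed by many more samples. Here is a minimal sketch with made-up numbers (nothing below comes from your benchmark):

```python
import random
import statistics

random.seed(0)

def run_once(n=10_000):
    # Hypothetical service times: ~20 ms baseline plus an occasional long stall.
    times = [random.gauss(20, 2) for _ in range(n)]
    if random.random() < 0.5:                    # e.g. a GC pause in some runs
        times[random.randrange(n)] += random.uniform(500, 3000)
    cuts = statistics.quantiles(times, n=1000)   # cut points at 0.1% steps
    return cuts[-1], max(times)                  # ~p99.9 and p100 (the max)

for i in range(5):
    p999, p100 = run_once()
    print(f"run {i}: p99.9 = {p999:6.1f} ms   p100 = {p100:7.1f} ms")
```

Across runs, p99.9 barely moves, while p100 jumps by orders of magnitude depending on whether a stall happened at all.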

I would also suggest looking at the difference between latency and service_time: they differ in your case, which shows there is a queuing effect, and queuing makes the higher percentiles even less stable.
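As a rough sketch of that queuing effect (a simplified single-client model with made-up numbers, not Rally's actual scheduler): latency is measured from the scheduled start of a request, service_time from its actual start, so once the schedule outruns the system, latency keeps growing while service_time stays flat.

```python
def simulate(target_throughput, service_time_s, n=1000):
    interval = 1.0 / target_throughput        # scheduled gap between requests
    finish = 0.0                              # completion time of the previous request
    last_latency = last_service = 0.0
    for i in range(n):
        scheduled_start = i * interval
        actual_start = max(scheduled_start, finish)   # wait while the client is busy
        finish = actual_start + service_time_s
        last_latency = finish - scheduled_start       # includes the queuing delay
        last_service = finish - actual_start          # round trip only
    return last_latency, last_service

# Assumed numbers purely for illustration: asking 100 ops/s from a single
# client whose requests take 44 ms each (~23 ops/s of actual capacity).
lat, svc = simulate(target_throughput=100, service_time_s=0.044)
print(f"latency after 1000 requests: ~{lat:.0f} s, service_time: {svc * 1000:.0f} ms")
```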

Finally, why are you saying that search qps is the same? I can see a median throughput of 26 ops/s in the upper screenshot and 107 ops/s in the lower one.

Thanks for your reply. Sorry, I used the same target-throughput and client count for both runs, so I carelessly assumed the QPS was the same.
What you said is right. I wanted to run a stress test, so I set the target throughput very high; CPU usage went above 90%, requests started to accumulate, and that affected the latency metric. So should I set target-throughput to a smaller value?

It depends on what you're trying to achieve, but yes, I would usually recommend setting the target-throughput below the ops/s you're actually seeing. Using the lower screenshot as an example: while the median service time is 44 ms, the median latency is 143 s, more than 3,000 times higher! At this point the system isn't really usable.
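As a rough sketch of what "below the ops/s you're actually seeing" could mean in numbers (the 20% margin below is an assumption, not a Rally rule):

```python
observed_throughput = 107   # median ops/s from the lower screenshot
headroom = 0.8              # assumption: leave ~20% of margin
print(f"suggested target-throughput: ~{headroom * observed_throughput:.0f} ops/s")
```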


Thanks. Let me ask an unrelated question: in a term query, is the keyword type less expensive than the integer type? I found that a term query on an integer field uses the BKD tree and gets rewritten into a range-style query.
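For illustration, this is the comparison I mean (hypothetical index and field names, not my real ones):

```python
# Hypothetical mappings and queries, just to show what is being compared.
mappings = {
    "properties": {
        "user_id_kw":  {"type": "keyword"},   # term lookup against the inverted index
        "user_id_int": {"type": "integer"},   # numeric field, indexed as BKD points
    }
}

term_on_keyword = {"query": {"term": {"user_id_kw": "12345"}}}
term_on_integer = {"query": {"term": {"user_id_int": 12345}}}
```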

I don't know, sorry. I would recommend opening a new topic.

Excuse me, here is another case with the same configuration and query conditions: in the first run I set the query size parameter to 10, and in the second I set it to 100, with target-throughput set to 100 in both. The first run reports a throughput of about 100 ops/s, but the second only reaches about 40 ops/s and the latency is very large. I ran GET _tasks during the second run and saw very few search tasks, and CPU and memory usage were normal (less than 40%). Why is there such a big difference?
elasticsearch version: 7.10.2
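For reference, the two request bodies differ only in size (the query itself is a placeholder, not my real one):

```python
base_query = {"query": {"match": {"message": "example"}}}   # placeholder query

first_run  = {**base_query, "size": 10}    # reaches ~100 ops/s
second_run = {**base_query, "size": 100}   # only ~40 ops/s, much higher latency
```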
