Rally benchmark results have much smaller latency than real

fatcloud · February 23, 2024, 4:35am

I wrote a Rally benchmark with a single search query operation. The result shows a much lower p50 latency (~200 ms) than the latency I observe when I send the same query with curl (several seconds).

I already set the cache to false in both the index settings and the Rally benchmark. Could the Rally benchmark still be affected by the request cache? If so, how is that percentile latency useful?

Christian_Dahlqvist · February 23, 2024, 5:36am

One big difference between Rally and curl is that curl establishes a new connection for every request, which adds a lot of overhead and latency, especially if the connection is encrypted. Rally, like all language clients, sets up a pool of connections and reuses them. I am not sure how much of the latency difference that explains.

Is the query you are sending through curl and Rally equivalent? How is it configured? What happens if you send the curl request several times in a row?

If you are sending the same query multiple times in a row and comparing that to a single curl request, it is possible that the data has been loaded into the page cache and is served from memory rather than requiring disk access. How much this matters depends on the amount of data you have and the load the cluster is under, but it could make a big difference if, for example, the cluster is under load and you have slow storage.
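One way to separate these two effects (connection setup and cache warm-up) is to time the same request several times, once with a fresh connection per request, as curl does, and once on a single reused connection, as a language client does. The following is only a rough sketch using Python's standard library, not what Rally does internally. The host, port, index name `my-index`, and the `match_all` query are placeholder assumptions, and it assumes plain HTTP without authentication.

```python
import http.client
import json
import statistics
import time

# Assumed placeholders: replace with your own cluster, index, and query.
HOST, PORT = "localhost", 9200
PATH = "/my-index/_search?request_cache=false"
BODY = json.dumps({"query": {"match_all": {}}})
HEADERS = {"Content-Type": "application/json"}


def send(conn):
    """Send one search request and return the elapsed time in milliseconds."""
    start = time.perf_counter()
    # The connection is opened lazily, so a fresh connection pays for
    # the TCP connect inside this timed region.
    conn.request("POST", PATH, body=BODY, headers=HEADERS)
    conn.getresponse().read()  # read the whole body so it is included
    return (time.perf_counter() - start) * 1000.0


def time_requests(n, reuse):
    """Time n requests on one reused connection, or on a new one each time."""
    durations = []
    conn = http.client.HTTPConnection(HOST, PORT, timeout=60)
    for _ in range(n):
        if not reuse:
            conn.close()
            conn = http.client.HTTPConnection(HOST, PORT, timeout=60)
        durations.append(send(conn))
    conn.close()
    return durations


def summarize(durations_ms):
    """Report the first (possibly cold) request separately from the median."""
    return {
        "first": durations_ms[0],
        "p50": statistics.median(durations_ms),
        "max": max(durations_ms),
    }


if __name__ == "__main__":
    print("new connection each time:", summarize(time_requests(20, reuse=False)))
    print("reused connection:       ", summarize(time_requests(20, reuse=True)))
```

If only the first request is slow in both modes, caching is the more likely explanation. If every request on a fresh connection is slow, connection setup is the more likely one.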

gareth-ellis · February 23, 2024, 10:05am

Hello,

To add to what Christian said, remember that the latency measured by both Rally and curl is made up of several components. Because Rally reuses connections, not every request needs to do a DNS lookup and connect.

Rally uses the Python client that Elastic provides, so the idea is that Rally accesses Elasticsearch in the same way that most users of Elasticsearch do.

I would suggest starting by validating that both curl and Rally are accessing Elasticsearch the same way. For example, are both running from the same server, or is curl running from your local machine and possibly experiencing more latency due to network time?

curl has an option to print out various internal variables, some of which show how long each step took. You can try the -w flag (see curl - How To Use). I would suggest including all the time_* variables to see where the time is going.
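As a rough illustration of that suggestion, the sketch below builds a curl command whose -w format prints the common time_* variables, runs it, and parses the result. The URL, index name, and query body are placeholder assumptions, and the helper names `build_curl_command` and `parse_timings` are made up for this example.

```python
import os
import subprocess

# curl --write-out timing variables, all reported in seconds.
TIME_VARS = [
    "time_namelookup",     # DNS lookup finished
    "time_connect",        # TCP connection established
    "time_appconnect",     # TLS handshake finished (0 for plain HTTP)
    "time_pretransfer",    # ready to start the transfer
    "time_starttransfer",  # first byte of the response received
    "time_total",          # whole operation finished
]


def build_curl_command(url, body):
    """Build a curl command that prints one 'name=value' line per timer."""
    # The literal \n is expanded to a newline by curl itself.
    fmt = "".join(f"{name}=%{{{name}}}\\n" for name in TIME_VARS)
    return [
        "curl", "-s", "-o", os.devnull, "-w", fmt,
        "-H", "Content-Type: application/json",
        "-d", body, url,
    ]


def parse_timings(output):
    """Turn curl's 'name=value' lines into a dict of floats (seconds)."""
    timings = {}
    for line in output.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            timings[name.strip()] = float(value)
    return timings


if __name__ == "__main__":
    # Assumed placeholders: replace with your own URL and query.
    url = "http://localhost:9200/my-index/_search"
    body = '{"query": {"match_all": {}}}'
    done = subprocess.run(build_curl_command(url, body),
                          capture_output=True, text=True, check=True)
    for name, seconds in parse_timings(done.stdout).items():
        print(f"{name:20s} {seconds * 1000:10.1f} ms")
```

A large gap between time_pretransfer and time_starttransfer would point at time spent waiting for Elasticsearch, while large time_namelookup, time_connect, or time_appconnect values would point at connection setup, which Rally mostly avoids by reusing connections.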

Gareth

system · March 22, 2024, 10:05am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
