What "Cumulative Time" exactly means?

Hello,
It's my first time using Elasticsearch, and I measured ES performance with Rally. I have some quick questions about interpreting the results.
My purpose is to compare the performance of ES in different environments.

  1. I'm confused about what "Cumulative Time" exactly means.
    At first, I wanted to know the wall clock time, but after looking into it in several ways, I felt that it isn't easy to obtain. (If there's an easy way to get it, please advise me.)
    I know it is related to the indexing threads, but ultimately I have no idea what it means or what can be deduced from this "Cumulative Time". What can I learn from it?
    If this time is short, can I say that the indexing performance is relatively good (I mean, when comparing different environments)?
    I also wonder if this value is related to the number of bulk indexing clients.

The overall indexing performance can be compared with throughput, but I need some advice here because I want to compare the performance of each item, such as merge or flush time.
I'm also confused about the Count and Min/Median/Max cumulative time values. I'd appreciate it if you could help me understand them.

  2. I also checked the indexing time with the Node Stats API, and it differed from ESRally's. What's the difference between the Node Stats API and ESRally?

Please advise me if there's a better way to interpret the results for my purpose.
Thanks

Hello,

As you are getting started with Elasticsearch (and benchmarking it), and your questions are (also) related to methodology, I strongly recommend watching the video on how to benchmark Elasticsearch with Rally (link in the docs).

Regarding the summary report and all the cumulative fields: I'd instead focus on understanding your use case, making sure you are using a workload that resembles it, and being clear about what it is you want to measure. E.g. if you have a logging indexing use case, there is no point running a workload that is mostly focused on geopoints or kNN search.

Once you've picked the right workload, you'll again need to follow the above methodology and target an ES cluster that is representative of your production environment. Continuing the logging example, you'll likely be interested in the indexing throughput you achieved (median, min, max to begin with). Conversely, if you want to benchmark queries only at a specific throughput, you'll likely want to measure service_time and latency (which should be very similar for a stable benchmark) and look at the median as well as outliers like the 95th or 99th percentile. You should read up on the various modes of running a task (docs here).
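If it helps to see the arithmetic, here is a minimal sketch (the sample values are made up for illustration) of how median and 95th/99th percentiles could be computed from a list of service_time samples:

```python
import statistics

# Hypothetical service_time samples in milliseconds (made-up values).
service_times_ms = [12.4, 13.1, 12.9, 14.7, 13.5, 55.2, 13.0, 12.8, 14.1, 13.3]

median = statistics.median(service_times_ms)

# statistics.quantiles with n=100 returns 99 cut points; index 94 is the
# 95th percentile and index 98 is the 99th percentile.
cut_points = statistics.quantiles(service_times_ms, n=100)
p95, p99 = cut_points[94], cut_points[98]

print(f"median={median:.1f} ms, p95={p95:.1f} ms, p99={p99:.1f} ms")
```

The single slow sample dominates the 99th percentile while barely moving the median, which is why looking only at averages can hide query outliers.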

Regarding the summary report, you can find explanations in this section, but for better analysis you should leverage an external Elasticsearch-based metrics store. You can then look at metrics at the individual level, per operation (in the rally-metrics* index), and at all summarized results (in the rally-results* index); more docs here and here.
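For example, with the Elasticsearch metrics store enabled, something along these lines could pull service_time percentiles for a single task. This is a rough sketch: the host, the index pattern, the task name and the field names (name, task, value) are assumptions based on a typical Rally setup, so check them against your own rally-metrics* documents.

```python
from elasticsearch import Elasticsearch

# Connect to the Elasticsearch cluster Rally uses as its metrics store
# (not the cluster under benchmark). Host is a placeholder.
es = Elasticsearch("http://localhost:9200")

resp = es.search(
    index="rally-metrics-*",
    size=0,
    # Filter on the metric name and the task; field and task names are assumptions.
    query={
        "bool": {
            "filter": [
                {"term": {"name": "service_time"}},
                {"term": {"task": "index-append"}},
            ]
        }
    },
    aggs={
        "service_time_pct": {
            "percentiles": {"field": "value", "percents": [50, 95, 99]}
        }
    },
)

# Prints something like {"50.0": ..., "95.0": ..., "99.0": ...} in ms.
print(resp["aggregations"]["service_time_pct"]["values"])
```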

I also checked the indexing time with the Node Stats API, and it differed from ESRally's. What's the difference between the Node Stats API and ESRally?

Rally uses the ES indices stats API to construct metrics like indexing_total_time and sums them up for all indices involved in the benchmark. These stats are collected at the end of the benchmark.
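If you want to reproduce such a number yourself, a rough sketch using the indices stats API could look like this (the host and index pattern are placeholders; adjust them to the indices your benchmark actually wrote to):

```python
from elasticsearch import Elasticsearch

# Cluster under benchmark; host and index pattern are placeholders.
es = Elasticsearch("http://localhost:9200")

stats = es.indices.stats(index="logs-*", metric="indexing")

# Sum the primary-shard indexing time over all matching indices,
# similar in spirit to how Rally aggregates indexing_total_time.
total_ms = sum(
    idx_stats["primaries"]["indexing"]["index_time_in_millis"]
    for idx_stats in stats["indices"].values()
)
print(f"total primary indexing time: {total_ms} ms")
```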

If you are comparing such stats, you must make sure that you are retrieving the right metric (for the right index) and remember that how the environment was created matters; if e.g. this is a production ES cluster receiving live data (which is an anti-pattern for benchmarks anyway), it's potentially not a stable system -- in the worst case, data may be ingested into an index that Rally is also writing to. Rally will warn you about a "non-fresh" cluster right at the beginning.


Thank you very much for your kind and detailed explanation. It's very helpful to me.
I just feel like I have a lot to study. Now I understand it to some extent, thanks to you.
I'll read the links you gave me one by one.
Thank you again for your reply! :smiley:

