Latency reporting and rate limiting in logging-indexing-querying track

Hi,

we're using Rally for performance evaluation. In our case, it's about the effect of the JVM on Elasticsearch's performance (disclaimer: I work for Azul).
We would like to use the track "elastic/logs" with the challenge "logging-indexing-querying", because based on our experience it represents quite a realistic scenario: customers constantly indexing new logs while running search queries in parallel, which is exactly what this challenge is supposed to exercise.
When we tried it, we found a couple of problems. Let me go through them separately.

Problem #1: Missing latencies for the individual queries
When executing the track, here's the result we get:

|                                                         Metric |                    Task |            Value |   Unit |
|---------------------------------------------------------------:|------------------------:|-----------------:|-------:|
|                     Cumulative indexing time of primary shards |                         |      1.56297     |    min |
|             Min cumulative indexing time across primary shards |                         |      0           |    min |
|          Median cumulative indexing time across primary shards |                         |      0.0449667   |    min |
|             Max cumulative indexing time across primary shards |                         |      0.838067    |    min |
|            Cumulative indexing throttle time of primary shards |                         |      0           |    min |
|    Min cumulative indexing throttle time across primary shards |                         |      0           |    min |
| Median cumulative indexing throttle time across primary shards |                         |      0           |    min |
|    Max cumulative indexing throttle time across primary shards |                         |      0           |    min |
|                        Cumulative merge time of primary shards |                         |      0.1191      |    min |
|                       Cumulative merge count of primary shards |                         |      5           |        |
|                Min cumulative merge time across primary shards |                         |      0           |    min |
|             Median cumulative merge time across primary shards |                         |      0           |    min |
|                Max cumulative merge time across primary shards |                         |      0.0779833   |    min |
|               Cumulative merge throttle time of primary shards |                         |      0           |    min |
|       Min cumulative merge throttle time across primary shards |                         |      0           |    min |
|    Median cumulative merge throttle time across primary shards |                         |      0           |    min |
|       Max cumulative merge throttle time across primary shards |                         |      0           |    min |
|                      Cumulative refresh time of primary shards |                         |      0.143067    |    min |
|                     Cumulative refresh count of primary shards |                         |    121           |        |
|              Min cumulative refresh time across primary shards |                         |      0           |    min |
|           Median cumulative refresh time across primary shards |                         |      0.00335     |    min |
|              Max cumulative refresh time across primary shards |                         |      0.0556167   |    min |
|                        Cumulative flush time of primary shards |                         |      0.00198333  |    min |
|                       Cumulative flush count of primary shards |                         |     13           |        |
|                Min cumulative flush time across primary shards |                         |      0           |    min |
|             Median cumulative flush time across primary shards |                         |      8.33333e-05 |    min |
|                Max cumulative flush time across primary shards |                         |      0.00075     |    min |
|                                                     Store size |                         |      0.101498    |     GB |
|                                                  Translog size |                         |      6.65896e-07 |     GB |
|                                         Heap used for segments |                         |      0           |     MB |
|                                       Heap used for doc values |                         |      0           |     MB |
|                                            Heap used for terms |                         |      0           |     MB |
|                                            Heap used for norms |                         |      0           |     MB |
|                                           Heap used for points |                         |      0           |     MB |
|                                    Heap used for stored fields |                         |      0           |     MB |
|                                                  Segment count |                         |     52           |        |
|                                    Total Ingest Pipeline count |                         | 400000           |        |
|                                     Total Ingest Pipeline time |                         |     69.393       |      s |
|                                   Total Ingest Pipeline failed |                         |      0           |        |
|                                                 Min Throughput |        insert-pipelines |     10.29        |  ops/s |
|                                                Mean Throughput |        insert-pipelines |     10.29        |  ops/s |
|                                              Median Throughput |        insert-pipelines |     10.29        |  ops/s |
|                                                 Max Throughput |        insert-pipelines |     10.29        |  ops/s |
|                                       100th percentile latency |        insert-pipelines |   1455.15        |     ms |
|                                  100th percentile service time |        insert-pipelines |   1455.15        |     ms |
|                                                     error rate |        insert-pipelines |      0           |      % |
|                                                 Min Throughput |              insert-ilm |     35.39        |  ops/s |
|                                                Mean Throughput |              insert-ilm |     35.39        |  ops/s |
|                                              Median Throughput |              insert-ilm |     35.39        |  ops/s |
|                                                 Max Throughput |              insert-ilm |     35.39        |  ops/s |
|                                       100th percentile latency |              insert-ilm |     26.9471      |     ms |
|                                  100th percentile service time |              insert-ilm |     26.9471      |     ms |
|                                                     error rate |              insert-ilm |      0           |      % |
|                                                     error rate |         discover/search |      0           |      % |
|                                                     error rate |      discover/visualize |      0           |      % |
|                                                     error rate |                   kafka |      0           |      % |
|                                                     error rate |                   nginx |      0           |      % |
|                                                     error rate |                  apache |      0           |      % |
|                                                     error rate |             system/auth |      0           |      % |
|                                                     error rate | system/syslog/dashboard |      0           |      % |
|                                                     error rate |      system/syslog/lens |      0           |      % |
|                                                     error rate |         mysql/dashboard |      0           |      % |
|                                                     error rate |                   redis |      0           |      % |
|                                                     error rate |              mysql/lens |      0           |      % |
|                                                     error rate |     postgresql/overview |      0           |      % |
|                                                     error rate |     postgresql/duration |      0           |      % |
|                                                 Min Throughput |              bulk-index |   1197.09        | docs/s |
|                                                Mean Throughput |              bulk-index |  12379.2         | docs/s |
|                                              Median Throughput |              bulk-index |  13364.8         | docs/s |
|                                                 Max Throughput |              bulk-index |  16276.7         | docs/s |
|                                        50th percentile latency |              bulk-index |    285.791       |     ms |
|                                        90th percentile latency |              bulk-index |    827.314       |     ms |
|                                        99th percentile latency |              bulk-index |   1304.18        |     ms |
|                                       100th percentile latency |              bulk-index |   2621.23        |     ms |
|                                   50th percentile service time |              bulk-index |    285.791       |     ms |
|                                   90th percentile service time |              bulk-index |    827.314       |     ms |
|                                   99th percentile service time |              bulk-index |   1304.18        |     ms |
|                                  100th percentile service time |              bulk-index |   2621.23        |     ms |
|                                                     error rate |              bulk-index |      0           |      % |

The way we see it, we only get error rates for the individual query tasks like nginx, kafka, etc. However, we would also like to see the latencies of those queries (i.e. how long it took to complete each one). Are we missing something? Is there a way to retrieve that information?

Problem #2: Length of the benchmark
Based on our observation, it looks like the benchmark lasts for as long as it takes the indexing part to finish, and the search queries are executed during that time.
Would it be possible to specify the length based on time instead? Let's say "do the indexing and the queries (repeating the set of queries one by one) for X seconds"?

Problem #3: Rate limiting
The scenario we want to evaluate the JVMs on is comparing the latencies of the search queries while indexing is running. To make that an apples-to-apples comparison, we need to be able to fix the number of indexing requests, and possibly also the number of search requests, per unit of time.
We haven't found a way to achieve that. In other tracks, such as "sql", there's a "target-throughput" directive that seems to do pretty much exactly what we want. Is this something that can be used in "logging-indexing-querying" as well, or is it track specific? Our current experiments suggest the latter.
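For reference, the kind of schedule entry we mean looks roughly like this (a sketch based on the Rally track reference, not copied from the actual sql track; the operation name and values are placeholders):

    {
      "operation": "some-search-operation",
      "clients": 1,
      "warmup-time-period": 120,
      "time-period": 3600,
      "target-throughput": 10
    }

As we understand it, target-throughput would cap the task at 10 operations per second across all clients.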

Thank you very much for any comments, highly appreciated.

Jiri

Would you mind posting your esrally ... invocation, along with all of the parameters?

Just based on that output, it doesn't look like the queries ever got to run, because the benchmark was over very quickly. I see that the total store size is only ~100 MB:

Store size |                         |      0.101498    |     GB |

The track itself has the bulk-index task and the workflow query tasks inside the same parallel element, which explicitly sets itself to complete when bulk indexing finishes:

     "name": "logging-queries-and-indexing",
      "parallel": {
        "completed-by": "bulk-index",
        "tasks": [
          {% for workflow in p_query_workflows %}
            {
              "name": {{workflow | tojson }},
              "operation": {
                "operation-type": "composite",
                "param-source": "workflow-selector",
                "workflow": {{workflow | tojson }},
                "task-offset": {{ loop.index }},
                "request-params": {{ p_query_request_params | tojson(indent=2) }}
              },
              "think-time-interval": {{ p_user_think_time }},
              "workflow-interval": {{ p_user_workflow_time }},
              "time-period": 99999,
              "warmup-time-period": {{ p_query_warmup_time_period }},
              "clients": 1,
              "schedule": "workflow-scheduler"
            },
          {% endfor %}
            {
              "name": "bulk-index",
              "operation": {
                "operation-type": "raw-bulk",
                "param-source": "processed-source",
                "time-format": "milliseconds",
                "profile": "fixed_interval",
                "bulk-size": {{ p_bulk_size }},
                "detailed-results": true
              },
              "clients": {{ p_bulk_indexing_clients }}{% if p_throttle_indexing %},
              "ignore-response-error-level": "{{error_level | default('non-fatal')}}",
              "schedule": "timestamp-throttler",
              "max-delay-secs": 1
              {% endif %}

I think you probably want to increase the raw_data_volume_per_day value to something > 100MB:

raw_data_volume_per_day (default: 0.1GB) - The volume of raw data to index per day.
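For example, you could bump it via a track-params file (a minimal sketch; the 5GB value is only an illustration, pick whatever fits your hardware and the run length you want):

    {
      "raw_data_volume_per_day": "5GB"
    }

and point Rally at it with --track-params=/path/to/params.json (inline key:value pairs on the command line work too).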

Problem #2: Length of the benchmark
Based on our observation, it looks like the benchmark lasts for as long as it takes the indexing part to finish, and the search queries are executed during that time.
Would it be possible to specify the length based on time instead? Let's say "do the indexing and the queries (repeating the set of queries one by one) for X seconds"?

You can do this, but you'll need to modify the track or make a copy yourself and tell Rally to use it via esrally race ... --track-path=/path/to/my/track.json.

Our Track Reference docs have a section on 'Time-based vs. iteration-based' operations, but it's worth mentioning that the track is already time based; it's just that the tasks are inside the same parallel element and that element explicitly sets itself to be completed-by the bulk-index task.
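As a rough sketch of what such a modification could look like (untested, and it assumes the corpus is large enough to keep bulk-index busy for the whole window): drop completed-by, and give both the query tasks and bulk-index an explicit time-period, e.g. one hour instead of the current 99999 seconds. The "..." entries below stand for the unchanged parts of the original track:

      "parallel": {
        "tasks": [
          {
            "name": "discover/search",
            "...": "unchanged composite operation, think-time and workflow intervals",
            "time-period": 3600,
            "clients": 1,
            "schedule": "workflow-scheduler"
          },
          {
            "name": "bulk-index",
            "...": "unchanged raw-bulk operation and client settings",
            "time-period": 3600
          }
        ]
      }

With no completed-by, the parallel element ends once all of its tasks have run for their time-period (or, for bulk-index, once the corpus is exhausted, whichever comes first).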

Problem #3: Rate limiting
The scenario we want to evaluate the JVMs on is comparing the latencies of the search queries while indexing is running. To make that an apples-to-apples comparison, we need to be able to fix the number of indexing requests, and possibly also the number of search requests, per unit of time.
We haven't found a way to achieve that. In other tracks, such as "sql", there's a "target-throughput" directive that seems to do pretty much exactly what we want. Is this something that can be used in "logging-indexing-querying" as well, or is it track specific? Our current experiments suggest the latter.

The indexing part of the track will run until the corpus has been exhausted. In this instance the elastic/logs track has a raw_data_volume_per_day parameter that adjusts the total amount of data to index per 24h period; the docs cover this in some detail.

I think the exact number of indexing requests is probably the wrong metric to rate limit on here. If you keep the raw_data_volume_per_day value the same between benchmarks and additionally set the track parameter throttle_indexing, you should get the exact same distribution of requests.

To follow up on rate limiting for the elastic/logs track, there are a few parameters you'll want to look at:

  • raw_data_volume_per_day
  • throttle_indexing
  • start_date
  • end_date

By default, Rally will try to index (raw_data_volume_per_day × the number of days between start_date and end_date) GB of logs as fast as possible.

For example, 10GB per day over 2020-01-01 -> 2020-01-02 would mean we'd try to index 20GB as fast as possible (it might only take a few minutes if your hardware is beefy enough).

To limit the throughput you can set throttle_indexing, which forces Rally to throttle indexing so that it only indexes raw_data_volume_per_day worth of logs (10GB in this example) per 24h window.

Getting the correct figure for raw_data_volume_per_day is going to depend on the hardware of the cluster, but as a rule of thumb we usually try to target roughly 80% disk utilisation per node.
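Put together, a track-params file for a throttled, fixed-volume run could look roughly like this (a sketch; the volume and dates are illustrative values, not recommendations):

    {
      "raw_data_volume_per_day": "10GB",
      "start_date": "2020-01-01",
      "end_date": "2020-01-02",
      "throttle_indexing": true
    }

Pass it with --track-params=/path/to/params.json; as long as both runs you compare use the same values, they should see the same indexing load.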

Hi Bradley,

Thanks for a super-quick response :slight_smile:

We don't run anything special, our rally command is:

esrally race --offline --pipeline=benchmark-only --track elastic/logs --challenge logging-indexing-querying --report-file=REPORT.md --target-hosts=localhost:9200 --track-params="number_of_replicas:0"

We're using the latest Rally version, 2.7.0, and Elasticsearch 8.5.3.

Thanks for the pointers. It looks like our benchmark is simply not running long enough. Also, the throttle_indexing suggestion looks like exactly what we need. We'll play with the options and I'll get back to you.

Thanks,
Jiri
