[WARNING] No throughput metrics available for [index-append]. Likely cause: The benchmark ended already during warmup

The cluster is running on Kubernetes.

NUMBER_OF_SHARDS=${NUMBER_OF_SHARDS:-10}
NUMBER_OF_REPLICAS=${NUMBER_OF_REPLICAS:-0}
INGEST_PERCENTAGE=${INGEST_PERCENTAGE:-100}
BULK_SIZE=${BULK_SIZE:-5000}
BULK_INDEXING_CLIENTS=${BULK_INDEXING_CLIENTS:-8}
REFRESH_INTERVAL=${REFRESH_INTERVAL:--1}

Heap size: 31 GB
Track: geopoint
1 data node: 31 GB heap, 16 vCPU
1 master node: 31 GB heap, 16 vCPU

The index-append error rate is 0.00%.
Please help: why am I getting this warning?

I don't see any errors in the logs either.

    ____        ____
3:    / __ \____ _/ / /_  __
3:   / /_/ / __ `/ / / / / /
3:  / _, _/ /_/ / / / /_/ /
3: /_/ |_|\__,_/_/_/\__, /
3:                 /____/
3:
3: [INFO] Decompressing track data from [/rally/.rally/benchmarks/data/geopoint/documents.json.bz2] to [/rally/.rally/benchmarks/data/geopoint/documents.json] (resulting size: [2.28] GB) ... [OK]
3: [INFO] Preparing file offset table for [/rally/.rally/benchmarks/data/geopoint/documents.json] ... [OK]
3: Running delete-index                                                           [100% done]
Running create-index                                                           [100% done]
Running check-cluster-health                                                   [100% done]
Running index-append                                                           [100% done]
Running refresh-after-index                                                    [100% done]
Running force-merge                                                            [100% done]
Running refresh-after-force-merge                                              [100% done]
Running wait-until-merges-finish                                               [100% done]
Running polygon                                                                [100% done]
Running bbox                                                                   [100% done]
Running distance                                                               [100% done]
Running distanceRange                                                          [100% done]
[INFO] Racing on track [geopoint], challenge [append-no-conflicts] and car ['external'] with version [8.2.0].
3:
3:
3: ------------------------------------------------------
3:     _______             __   _____
3:    / ____(_)___  ____ _/ /  / ___/_________  ________
3:   / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
3:  / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
3: /_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
3: ------------------------------------------------------
3:
3: Metric,Task,Value,Unit
3: Cumulative indexing time of primary shards,,12.6378,min
3: Min cumulative indexing time across primary shards,,2.44865,min
3: Median cumulative indexing time across primary shards,,2.47545,min
3: Max cumulative indexing time across primary shards,,2.7270833333333333,min
3: Cumulative indexing throttle time of primary shards,,0,min
3: Min cumulative indexing throttle time across primary shards,,0,min
3: Median cumulative indexing throttle time across primary shards,,0,min
3: Max cumulative indexing throttle time across primary shards,,0,min
3: Cumulative merge time of primary shards,,0.005716666666666667,min
3: Cumulative merge count of primary shards,,5,
3: Min cumulative merge time across primary shards,,0.0004333333333333333,min
3: Median cumulative merge time across primary shards,,0.00065,min
3: Max cumulative merge time across primary shards,,0.0027166666666666667,min
3: Cumulative merge throttle time of primary shards,,0,min
3: Min cumulative merge throttle time across primary shards,,0,min
3: Median cumulative merge throttle time across primary shards,,0,min
3: Max cumulative merge throttle time across primary shards,,0,min
3: Cumulative refresh time of primary shards,,0.4688333333333333,min
3: Cumulative refresh count of primary shards,,40,
3: Min cumulative refresh time across primary shards,,0.08556666666666667,min
3: Median cumulative refresh time across primary shards,,0.09646666666666667,min
3: Max cumulative refresh time across primary shards,,0.0972,min
3: Cumulative flush time of primary shards,,2.03135,min
3: Cumulative flush count of primary shards,,15,
3: Min cumulative flush time across primary shards,,0.39188333333333336,min
3: Median cumulative flush time across primary shards,,0.4111333333333333,min
3: Max cumulative flush time across primary shards,,0.42086666666666667,min
3: Total Young Gen GC time,,1.502,s
3: Total Young Gen GC count,,47,
3: Total Old Gen GC time,,0,s
3: Total Old Gen GC count,,0,
3: Store size,,3.163294860161841,GB
3: Translog size,,2.561137080192566e-07,GB
3: Heap used for segments,,0,MB
3: Heap used for doc values,,0,MB
3: Heap used for terms,,0,MB
3: Heap used for norms,,0,MB
3: Heap used for points,,0,MB
3: Heap used for stored fields,,0,MB
3: Segment count,,97,
3: error rate,index-append,0.00,%
3: Min Throughput,polygon,2.00,ops/s
3: Mean Throughput,polygon,2.00,ops/s
3: Median Throughput,polygon,2.00,ops/s
3: Max Throughput,polygon,2.01,ops/s
3: 50th percentile latency,polygon,34.06202851328999,ms
3: 90th percentile latency,polygon,35.03835282754153,ms
3: 99th percentile latency,polygon,36.454293074784836,ms
3: 100th percentile latency,polygon,43.34578406997025,ms
3: 50th percentile service time,polygon,32.78130997205153,ms
3: 90th percentile service time,polygon,33.77172515029088,ms
3: 99th percentile service time,polygon,34.83832438359972,ms
3: 100th percentile service time,polygon,41.91419004928321,ms
3: error rate,polygon,0.00,%
3: Min Throughput,bbox,2.00,ops/s
3: Mean Throughput,bbox,2.01,ops/s
3: Median Throughput,bbox,2.01,ops/s
3: Max Throughput,bbox,2.01,ops/s
3: 50th percentile latency,bbox,42.071087984368205,ms
3: 90th percentile latency,bbox,44.00668186135591,ms
3: 99th percentile latency,bbox,54.396707259584225,ms
3: 100th percentile latency,bbox,55.51117891445756,ms
3: 50th percentile service time,bbox,40.859389933757484,ms
3: 90th percentile service time,bbox,42.89666726253927,ms
3: 99th percentile service time,bbox,53.70071197277866,ms
3: 100th percentile service time,bbox,53.93424991052598,ms
3: error rate,bbox,0.00,%
3: Min Throughput,distance,5.01,ops/s
3: Mean Throughput,distance,5.01,ops/s
3: Median Throughput,distance,5.01,ops/s
3: Max Throughput,distance,5.01,ops/s
3: 50th percentile latency,distance,11.564237996935844,ms
3: 90th percentile latency,distance,12.179009604733437,ms
3: 99th percentile latency,distance,14.521090824855493,ms
3: 100th percentile latency,distance,14.75015701726079,ms
3: 50th percentile service time,distance,10.548858088441193,ms
3: 90th percentile service time,distance,11.092877946794033,ms
3: 99th percentile service time,distance,13.455560109578073,ms
3: 100th percentile service time,distance,13.488444034010172,ms
3: error rate,distance,0.00,%
3: Min Throughput,distanceRange,0.50,ops/s
3: Mean Throughput,distanceRange,0.50,ops/s
3: Median Throughput,distanceRange,0.50,ops/s
3: Max Throughput,distanceRange,0.50,ops/s
3: 50th percentile latency,distanceRange,1219.5492314640433,ms
3: 90th percentile latency,distanceRange,1234.1301434440538,ms
3: 99th percentile latency,distanceRange,1242.6772359397728,ms
3: 100th percentile latency,distanceRange,1252.7494929963723,ms
3: 50th percentile service time,distanceRange,1217.9977719206363,ms
3: 90th percentile service time,distanceRange,1232.5072522158735,ms
3: 99th percentile service time,distanceRange,1241.182112590177,ms
3: 100th percentile service time,distanceRange,1250.9003090672195,ms
3: error rate,distanceRange,0.00,%
3:
3: [WARNING] No throughput metrics available for [index-append]. Likely cause: The benchmark ended already during warmup.
3:
3: [INFO] Race id is [7511a44f-8e47-4070-ae1d-41546331e905]
3:
3: ----------------------------------
3: [INFO] SUCCESS (took 1133 seconds)
3: ----------------------------------

Any reply on this? It's been two days and I am not able to figure out what's happening.

Hi @json Jason Bryan, can you help me with this issue? I have been stuck for 3 days.

Hello, this is surprising. Have you tried reproducing this outside of your Kubernetes environment? What is the Rally invocation? Can you please share the logs?

Hi @Quentin_Pradet,

I haven't tried it outside the Kubernetes environment. I have an Elasticsearch cluster set up on Kubernetes.
The invocation command is below:

ELASTIC_EP=https://es-master:9200
CLIENT_OPTIONS="basic_auth_user:rally,basic_auth_password:changeme,timeout:120,use_ssl:true,verify_certs:false,ca_certs:/rally/cacert.pem"
echo "${ES_RALLY_RACE_params_json}  ${ES_RALLY_RACE} ${ELASTIC_EP} ${ES_TESTMODE} ${CLIENT_OPTIONS}"

echo "${ES_NO_OF_SHARDS} ${ES_NO_OF_REPLICAS} ${ES_INGEST_PERCENTAGE} ${ES_BULKSIZE} ${ES_BULK_INDEXING_CLIENT} ${ES_REFRESH_INTERVAL}"

esrally race --offline --track-params='{"number_of_shards":'${ES_NO_OF_SHARDS}',"number_of_replicas":'${ES_NO_OF_REPLICAS}',"ingest_percentage":'${ES_INGEST_PERCENTAGE}',"bulk_size":'${ES_BULKSIZE}',"bulk_indexing_clients":'${ES_BULK_INDEXING_CLIENT}',"index_settings": { "index.refresh_interval":'${ES_REFRESH_INTERVAL}' }}' --track-path=/rally/.rally/benchmarks/tracks/default/${ES_RALLY_RACE} --pipeline=benchmark-only --target-hosts=${ELASTIC_EP} ${ES_TESTMODE} --client-options ${CLIENT_OPTIONS} --report-format=csv
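
The hand-assembled JSON passed to --track-params above is easy to break with shell quoting. Below is a minimal Python sketch, not part of the original invocation, that builds the same string from the ES_* variables with json.dumps. The fallback values are an assumption: they copy the defaults listed at the top of the thread, on the guess that those map onto the ES_* names used here.

```python
import json
import os


def _number_or_text(raw):
    """Return an int or float when the value parses as one, else the raw string."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def build_track_params(env=None):
    """Build the --track-params JSON from the ES_* variables used above."""
    env = os.environ if env is None else env

    def get(name, default):
        return _number_or_text(str(env.get(name, default)))

    params = {
        "number_of_shards": get("ES_NO_OF_SHARDS", 10),
        "number_of_replicas": get("ES_NO_OF_REPLICAS", 0),
        "ingest_percentage": get("ES_INGEST_PERCENTAGE", 100),
        "bulk_size": get("ES_BULKSIZE", 5000),
        "bulk_indexing_clients": get("ES_BULK_INDEXING_CLIENT", 8),
        "index_settings": {
            "index.refresh_interval": get("ES_REFRESH_INTERVAL", -1),
        },
    }
    return json.dumps(params)


if __name__ == "__main__":
    # Print the JSON so it can be passed as a single quoted argument.
    print(build_track_params())
```

The printed string can then be passed as one quoted argument, e.g. --track-params="$(python build_params.py)", where build_params.py is a hypothetical file name for the script above.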

The log is quite large. How can I share it? Please let me know.

With geopoint I get this warning with the default parameters, and with nyc_taxis I get it with 16, 32, 64, and 128 clients and a bulk size of 10000.

Is there any update on the above issue? I can't upload the logs to Elastic Upload Service : Login because I am not able to log in with the email I am using. Can I share them somewhere else?

Sorry, the correct link is Elastic Upload Service : Upload.

Anyway, we discussed this with @dliappis and it turns out the warning is probably accurate:

[WARNING] No throughput metrics available for [index-append]. Likely cause: The benchmark ended already during warmup.

Can you try reducing the warmup in those tracks and see if you get results?
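
As a rough illustration of what reducing the warmup could involve: Rally tasks take a `warmup-time-period` property, in seconds. The sketch below only works on literal numeric values in the text of a challenge file. The `sample` excerpt is hypothetical, and the real file in your track checkout may differ or use a templated value that this regex will not match.

```python
import re

# Matches a task property such as:  "warmup-time-period": 120
WARMUP_RE = re.compile(r'("warmup-time-period"\s*:\s*)(\d+)')


def find_warmups(track_text):
    """Return (line_number, seconds) for every literal warmup-time-period."""
    hits = []
    for number, line in enumerate(track_text.splitlines(), start=1):
        for match in WARMUP_RE.finditer(line):
            hits.append((number, int(match.group(2))))
    return hits


def cap_warmups(track_text, max_seconds):
    """Lower every literal warmup-time-period above max_seconds to max_seconds."""
    def replace(match):
        capped = min(int(match.group(2)), max_seconds)
        return match.group(1) + str(capped)
    return WARMUP_RE.sub(replace, track_text)


# Hypothetical excerpt of a challenge definition, for illustration only.
sample = '''{
  "operation": "index-append",
  "warmup-time-period": 120,
  "clients": 8
}'''

print(find_warmups(sample))     # line numbers and current warmup values
print(cap_warmups(sample, 10))  # same text with the warmup capped at 10 s
```

Work on a copy of the track and point --track-path at it, so the original stays available for comparison.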

Hi @Quentin_Pradet

Thank you for your quick response.
I suspected the same thing you mentioned, so reducing the warmup in those tracks should affect the results.

I will test it and share the details.

Additionally, you should consider whether these standard workloads are suitable for what you want to benchmark. Are they representative of the way your organization uses the Elastic stack? One indication is exactly what you just saw, i.e. that the warmup time is longer than the actual time taken to index. The datasets might therefore be too small, at least for indexing; you'd need to consider the size of your cluster in relation to the size of the workload.
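
The point about warmup outlasting indexing comes down to simple arithmetic. Here is a rough sketch with made-up numbers: the document counts, throughput and warmup period below are illustrative assumptions, not measurements from this race.

```python
def indexing_seconds(total_docs, docs_per_second):
    """Estimate how long indexing the whole corpus takes."""
    return total_docs / docs_per_second


def ends_during_warmup(total_docs, docs_per_second, warmup_seconds):
    """True when indexing finishes before the warmup period is over.

    In that case every sample is a warmup sample, so no throughput
    can be reported for the task.
    """
    return indexing_seconds(total_docs, docs_per_second) <= warmup_seconds


# Illustrative values only: a small corpus on a fast cluster...
small = ends_during_warmup(total_docs=1_000_000, docs_per_second=20_000,
                           warmup_seconds=120)
# ...versus a corpus ten times larger at the same rate.
large = ends_during_warmup(total_docs=10_000_000, docs_per_second=20_000,
                           warmup_seconds=120)
print(small, large)
```

With these numbers the small corpus is indexed in 50 s, inside the 120 s warmup, while the larger one takes 500 s and leaves 380 s of measured samples.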

I strongly recommend watching this talk and working on creating a dataset that is representative of your own use case.

Hi @dliappis ,

Thank you for the response.

As you indicated above, the warmup time is indeed longer than the time taken to index the data.

I want to observe server performance, such as CPU utilization, memory usage, and disk usage, on different platforms while benchmarking Elasticsearch.

I am trying to find the saturation point for CPU, memory, and disk usage.

Please suggest which track would be suitable for this.

If I understood correctly, what you are trying to do, i.e. explore the saturation point of hardware resources, doesn't seem like a good fit for a macrobenchmarking tool like Rally, but rather for a hardware benchmarking suite, such as fio for the I/O side.

If you are trying to understand whether server X is better than server Y for the kind of workload that Elasticsearch is serving in your organization, then, as I mentioned earlier, you will need to create your own Rally track that models your use case. Then by running your track and analyzing the resource usage on your servers you can explore whether you can achieve better metrics (e.g. median indexing throughput or lower latency for your queries) and/or satisfy your SLOs as well as understand the bottleneck of your benchmark.
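
When comparing runs, it can help to pull the per-task numbers out of the CSV report rather than reading them by eye. This is a minimal sketch that assumes the report was saved to a file (for example with --report-file) in the Metric,Task,Value,Unit layout shown earlier in this thread, without the "3:" log prefix. The sample rows are copied from that report.

```python
import csv
import io


def median_throughput(report_text):
    """Return ({task: median throughput}, [tasks with no throughput rows])."""
    throughput = {}
    seen_tasks = set()
    for row in csv.reader(io.StringIO(report_text)):
        # Skip the header and anything that is not a 4-column metric row.
        if len(row) != 4 or row[0] == "Metric":
            continue
        metric, task, value, _unit = row
        if not task:
            continue  # cluster-wide metrics have an empty task column
        seen_tasks.add(task)
        if metric == "Median Throughput":
            throughput[task] = float(value)
    missing = sorted(seen_tasks - set(throughput))
    return throughput, missing


# A few rows taken from the report posted above.
sample_report = """Metric,Task,Value,Unit
Segment count,,97,
error rate,index-append,0.00,%
Median Throughput,polygon,2.00,ops/s
error rate,polygon,0.00,%
Median Throughput,distance,5.01,ops/s
"""

throughput, missing = median_throughput(sample_report)
print(throughput)
print(missing)
```

A task that shows up in `missing`, as index-append does here, has an error rate row but no throughput rows, which is the same condition the warning reports.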
