I am benchmarking our cluster with Rally, but data is not being ingested into the index and I am getting the warnings below.
2024-03-26 07:31:51,713 -not-actor-/PID:141 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 8 times in a row, putting on
30 second timeout
2024-03-26 07:31:51,741 -not-actor-/PID:142 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 7 times in a row, putting on
30 second timeout
2024-03-26 07:31:52,617 -not-actor-/PID:146 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 8 times in a row, putting on
30 second timeout
2024-03-26 07:31:53,85 -not-actor-/PID:145 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 7 times in a row, putting on
30 second timeout
2024-03-26 07:31:53,836 -not-actor-/PID:143 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 8 times in a row, putting on
30 second timeout
2024-03-26 07:31:54,789 -not-actor-/PID:140 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 8 times in a row, putting on
30 second timeout
2024-03-26 07:31:56,4 -not-actor-/PID:144 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 8 times in a row, putting on
30 second timeout
2024-03-26 07:31:59,407 -not-actor-/PID:142 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 8 times in a row, putting on
30 second timeout
2024-03-26 07:32:01,113 -not-actor-/PID:147 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 8 times in a row, putting on
30 second timeout
2024-03-26 07:32:01,390 -not-actor-/PID:145 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 8 times in a row, putting on
30 second timeout
2024-03-26 07:32:01,653 -not-actor-/PID:143 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 9 times in a row, putting on
30 second timeout
2024-03-26 07:32:02,347 -not-actor-/PID:141 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 9 times in a row, putting on
30 second timeout
2024-03-26 07:32:02,527 -not-actor-/PID:140 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 9 times in a row, putting on
30 second timeout
2024-03-26 07:32:02,578 -not-actor-/PID:144 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 9 times in a row, putting on
30 second timeout
2024-03-26 07:32:03,150 -not-actor-/PID:146 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 9 times in a row, putting on
30 second timeout
2024-03-26 07:32:05,919 -not-actor-/PID:142 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 9 times in a row, putting on
30 second timeout
2024-03-26 07:32:08,472 -not-actor-/PID:145 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 9 times in a row, putting on
30 second timeout
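To show how widespread these warnings are across Rally's worker processes, here is a minimal Python sketch that tallies the highest consecutive-failure count per PID. The function name `max_failures_per_pid` and the regular expression are my own assumptions based on the log lines above, not part of Rally; the sample text is copied from those lines, cut off before the wrapped "30 second timeout" part.

```python
import re
from collections import defaultdict

# Matches the elastic_transport node_pool warnings in the format shown above.
# The pattern is an assumption derived from those log lines only.
WARNING_RE = re.compile(
    r"PID:(?P<pid>\d+) elastic_transport\.node_pool WARNING "
    r"Node <.*?> has failed for (?P<count>\d+) times in a row"
)


def max_failures_per_pid(log_text):
    """Return {pid: highest consecutive-failure count reported for that PID}."""
    worst = defaultdict(int)
    for match in WARNING_RE.finditer(log_text):
        pid = int(match.group("pid"))
        worst[pid] = max(worst[pid], int(match.group("count")))
    return dict(worst)


# Four of the warnings from the log excerpt above.
sample = """\
2024-03-26 07:31:51,741 -not-actor-/PID:142 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 7 times in a row, putting on
2024-03-26 07:31:53,836 -not-actor-/PID:143 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 8 times in a row, putting on
2024-03-26 07:31:59,407 -not-actor-/PID:142 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 8 times in a row, putting on
2024-03-26 07:32:01,653 -not-actor-/PID:143 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(https://x.x.x.x:9200)> has failed for 9 times in a row, putting on
"""

print(max_failures_per_pid(sample))  # {142: 8, 143: 9}
```

Run against the full log, this would show whether every worker process is hitting the same node failures or only some of them.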
I get the results below after the race completes.
Clients: 80
[root@n1pl-pa-hdd02 cms-new-new]# docker run -v /hdfs16/tracks:/rally/.rally/benchmarks/tracks --user root dartsregistry.india.airtel.itm/zabbix/elastic/rally:2.10.0 race --track-path=/rally/.rally/benchmarks/tracks/cms-new-new --pipeline=benchmark-only --target-hosts=10.223.74.35:9200 --client-options "basic_auth_user:'elastic',basic_auth_password:'sxdKecnOukHJhQHPNNwc',use_ssl:true,verify_certs:false,timeout:120"
____ ____
/ __ \____ _/ / /_ __
/ /_/ / __ `/ / / / / /
/ _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
/____/
[WARNING] merges_total_time is 17678925401 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] merges_total_throttled_time is 8779844243 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] indexing_total_time is 24486023239 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] indexing_throttle_time is 103311 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] refresh_total_time is 1818705576 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] flush_total_time is 615978166 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
Running delete-index [100% done]
Running create-index [100% done]
Running cluster-health [100% done]
Running bulk [100% done]
[INFO] Racing on track [cms-new-new] and car ['external'] with version [8.12.0].
------------------------------------------------------
_______ __ _____
/ ____(_)___ ____ _/ / / ___/_________ ________
/ /_ / / __ \/ __ `/ / \__ \/ ___/ __ \/ ___/ _ \
/ __/ / / / / / /_/ / / ___/ / /__/ /_/ / / / __/
/_/ /_/_/ /_/\__,_/_/ /____/\___/\____/_/ \___/
------------------------------------------------------
| Metric | Task | Value | Unit |
|---------------------------------------------------------------:|----------------------------------------------------------:|-----------------:|-------:|
| Cumulative indexing time of primary shards | | 409524 | min |
| Min cumulative indexing time across primary shards | | 0 | min |
| Median cumulative indexing time across primary shards | | 14.1264 | min |
| Max cumulative indexing time across primary shards | | 1372.77 | min |
| Cumulative indexing throttle time of primary shards | | 4.36913 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 1.37417 | min |
| Cumulative merge time of primary shards | | 295030 | min |
| Cumulative merge count of primary shards | | 5.17712e+06 | |
| Min cumulative merge time across primary shards | | 0 | min |
| Median cumulative merge time across primary shards | | 18.7617 | min |
| Max cumulative merge time across primary shards | | 2266.22 | min |
| Cumulative merge throttle time of primary shards | | 146434 | min |
| Min cumulative merge throttle time across primary shards | | 0 | min |
| Median cumulative merge throttle time across primary shards | | 2.13259 | min |
| Max cumulative merge throttle time across primary shards | | 1486.86 | min |
| Cumulative refresh time of primary shards | | 30347.7 | min |
| Cumulative refresh count of primary shards | | 6.6217e+07 | |
| Min cumulative refresh time across primary shards | | 0 | min |
| Median cumulative refresh time across primary shards | | 1.1601 | min |
| Max cumulative refresh time across primary shards | | 390.803 | min |
| Cumulative flush time of primary shards | | 10293.4 | min |
| Cumulative flush count of primary shards | | 4.24626e+06 | |
| Min cumulative flush time across primary shards | | 0 | min |
| Median cumulative flush time across primary shards | | 3.98247 | min |
| Max cumulative flush time across primary shards | | 108.61 | min |
| Min ML processing time | airflow_job_duration_span | 0 | ms |
| Mean ML processing time | airflow_job_duration_span | 36.7278 | ms |
| Median ML processing time | airflow_job_duration_span | 6.32557 | ms |
| Max ML processing time | airflow_job_duration_span | 476 | ms |
| Min ML processing time | edw_base.pre_to_post_order_defaultcounttrend | 0 | ms |
| Mean ML processing time | edw_base.pre_to_post_order_defaultcounttrend | 0.0791768 | ms |
| Median ML processing time | edw_base.pre_to_post_order_defaultcounttrend | 0 | ms |
| Max ML processing time | edw_base.pre_to_post_order_defaultcounttrend | 1 | ms |
| Min ML processing time | ucmapm_tx_metrics | 0 | ms |
| Mean ML processing time | ucmapm_tx_metrics | 10.6932 | ms |
| Median ML processing time | ucmapm_tx_metrics | 10.255 | ms |
| Max ML processing time | ucmapm_tx_metrics | 132 | ms |
| Min ML processing time | kibana-logs-ui-default-default-log-entry-categories-count | 3 | ms |
| Mean ML processing time | kibana-logs-ui-default-default-log-entry-categories-count | 42.4964 | ms |
| Median ML processing time | kibana-logs-ui-default-default-log-entry-categories-count | 26.4296 | ms |
| Max ML processing time | kibana-logs-ui-default-default-log-entry-categories-count | 426 | ms |
| Min ML processing time | kibana-metrics-ui-default-default-hosts_memory_usage | 0 | ms |
| Mean ML processing time | kibana-metrics-ui-default-default-hosts_memory_usage | 4.37006 | ms |
| Median ML processing time | kibana-metrics-ui-default-default-hosts_memory_usage | 3.11744 | ms |
| Max ML processing time | kibana-metrics-ui-default-default-hosts_memory_usage | 24 | ms |
| Min ML processing time | kibana-metrics-ui-default-default-hosts_network_in | 0 | ms |
| Mean ML processing time | kibana-metrics-ui-default-default-hosts_network_in | 1.71899 | ms |
| Median ML processing time | kibana-metrics-ui-default-default-hosts_network_in | 1 | ms |
| Max ML processing time | kibana-metrics-ui-default-default-hosts_network_in | 22 | ms |
| Min ML processing time | kibana-metrics-ui-default-default-hosts_network_out | 0 | ms |
| Mean ML processing time | kibana-metrics-ui-default-default-hosts_network_out | 1.27293 | ms |
| Median ML processing time | kibana-metrics-ui-default-default-hosts_network_out | 1 | ms |
| Max ML processing time | kibana-metrics-ui-default-default-hosts_network_out | 28 | ms |
| Min ML processing time | logs-11558ee526445db2b42eb3d6b4af58d0-log-entry-rate | 1 | ms |
| Mean ML processing time | logs-11558ee526445db2b42eb3d6b4af58d0-log-entry-rate | 5.7039 | ms |
| Median ML processing time | logs-11558ee526445db2b42eb3d6b4af58d0-log-entry-rate | 5 | ms |
| Max ML processing time | logs-11558ee526445db2b42eb3d6b4af58d0-log-entry-rate | 153 | ms |
| Min ML processing time | storage_estimate | 0 | ms |
| Mean ML processing time | storage_estimate | 1.31634 | ms |
| Median ML processing time | storage_estimate | 1 | ms |
| Max ML processing time | storage_estimate | 54 | ms |
| Min ML processing time | edw_agg.fact_home_usage_mly_defaultcounttrend | 0 | ms |
| Mean ML processing time | edw_agg.fact_home_usage_mly_defaultcounttrend | 1.53396 | ms |
| Median ML processing time | edw_agg.fact_home_usage_mly_defaultcounttrend | 0 | ms |
| Max ML processing time | edw_agg.fact_home_usage_mly_defaultcounttrend | 124 | ms |
| Total Young Gen GC time | | 210.646 | s |
| Total Young Gen GC count | | 1491 | |
| Total Old Gen GC time | | 0 | s |
| Total Old Gen GC count | | 0 | |
| Store size | | 89242.8 | GB |
| Translog size | | 11.6755 | GB |
| Heap used for segments | | 0 | MB |
| Heap used for doc values | | 0 | MB |
| Heap used for terms | | 0 | MB |
| Heap used for norms | | 0 | MB |
| Heap used for points | | 0 | MB |
| Heap used for stored fields | | 0 | MB |
| Segment count | | 73017 | |
| Total Ingest Pipeline count | | 5.42233e+07 | |
| Total Ingest Pipeline time | | 3887.27 | s |
| Total Ingest Pipeline failed | | 0 | |
| Min Throughput | bulk | 2517.93 | docs/s |
| Mean Throughput | bulk | 134547 | docs/s |
| Median Throughput | bulk | 136443 | docs/s |
| Max Throughput | bulk | 146047 | docs/s |
| 50th percentile latency | bulk | 10977.8 | ms |
| 90th percentile latency | bulk | 18302 | ms |
| 99th percentile latency | bulk | 22912.8 | ms |
| 99.9th percentile latency | bulk | 28558.9 | ms |
| 100th percentile latency | bulk | 29208.3 | ms |
| 50th percentile service time | bulk | 10977.8 | ms |
| 90th percentile service time | bulk | 18302 | ms |
| 99th percentile service time | bulk | 22912.8 | ms |
| 99.9th percentile service time | bulk | 28558.9 | ms |
| 100th percentile service time | bulk | 29208.3 | ms |
| error rate | bulk | 0 | % |
[INFO] Race id is [c909ab14-8e8b-4a3b-a572-1c2c9b553110]
---------------------------------
[INFO] SUCCESS (took 999 seconds)
---------------------------------
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
Bulk size: 20000
Clients: 200
[root@n1pl-pa-hdd02 cms-new-new]# docker run -v /hdfs16/tracks:/rally/.rally/benchmarks/tracks --user root dartsregistry.india.airtel.itm/zabbix/elastic/rally:2.10.0 race --track-path=/rally/.rally/benchmarks/tracks/cms-new-new --pipeline=benchmark-only --target-hosts=10.223.74.35:9200 --client-options "basic_auth_user:'elastic',basic_auth_password:'sxdKecnOukHJhQHPNNwc',use_ssl:true,verify_certs:false"
____ ____
/ __ \____ _/ / /_ __
/ /_/ / __ `/ / / / / /
/ _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
/____/
[WARNING] merges_total_time is 17740234685 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] merges_total_throttled_time is 8797079411 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] indexing_total_time is 24602730839 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] indexing_throttle_time is 262148 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] refresh_total_time is 1822612383 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] flush_total_time is 618589613 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
Running delete-index [100% done]
Running create-index [100% done]
Running cluster-health [100% done]
Running bulk [100% done]
[INFO] Racing on track [cms-new-new] and car ['external'] with version [8.12.0].
------------------------------------------------------
_______ __ _____
/ ____(_)___ ____ _/ / / ___/_________ ________
/ /_ / / __ \/ __ `/ / \__ \/ ___/ __ \/ ___/ _ \
/ __/ / / / / / /_/ / / ___/ / /__/ /_/ / / / __/
/_/ /_/_/ /_/\__,_/_/ /____/\___/\____/_/ \___/
------------------------------------------------------
| Metric | Task | Value | Unit |
|---------------------------------------------------------------:|----------------------------------------------------------:|-----------------:|-------:|
| Cumulative indexing time of primary shards | | 409067 | min |
| Min cumulative indexing time across primary shards | | 0 | min |
| Median cumulative indexing time across primary shards | | 14.364 | min |
| Max cumulative indexing time across primary shards | | 1375.71 | min |
| Cumulative indexing throttle time of primary shards | | 1.85487 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0.327583 | min |
| Cumulative merge time of primary shards | | 294932 | min |
| Cumulative merge count of primary shards | | 5.17274e+06 | |
| Min cumulative merge time across primary shards | | 0 | min |
| Median cumulative merge time across primary shards | | 18.9366 | min |
| Max cumulative merge time across primary shards | | 2268.22 | min |
| Cumulative merge throttle time of primary shards | | 146345 | min |
| Min cumulative merge throttle time across primary shards | | 0 | min |
| Median cumulative merge throttle time across primary shards | | 2.12753 | min |
| Max cumulative merge throttle time across primary shards | | 1487.98 | min |
| Cumulative refresh time of primary shards | | 30306.5 | min |
| Cumulative refresh count of primary shards | | 6.59936e+07 | |
| Min cumulative refresh time across primary shards | | 0 | min |
| Median cumulative refresh time across primary shards | | 1.15628 | min |
| Max cumulative refresh time across primary shards | | 390.803 | min |
| Cumulative flush time of primary shards | | 10288.4 | min |
| Cumulative flush count of primary shards | | 4.24352e+06 | |
| Min cumulative flush time across primary shards | | 1.66667e-05 | min |
| Median cumulative flush time across primary shards | | 3.98017 | min |
| Max cumulative flush time across primary shards | | 108.918 | min |
| Min ML processing time | airflow_job_duration_span | 0 | ms |
| Mean ML processing time | airflow_job_duration_span | 36.7278 | ms |
| Median ML processing time | airflow_job_duration_span | 6.32557 | ms |
| Max ML processing time | airflow_job_duration_span | 476 | ms |
| Min ML processing time | edw_base.pre_to_post_order_defaultcounttrend | 0 | ms |
| Mean ML processing time | edw_base.pre_to_post_order_defaultcounttrend | 0.0791768 | ms |
| Median ML processing time | edw_base.pre_to_post_order_defaultcounttrend | 0 | ms |
| Max ML processing time | edw_base.pre_to_post_order_defaultcounttrend | 1 | ms |
| Min ML processing time | ucmapm_tx_metrics | 0 | ms |
| Mean ML processing time | ucmapm_tx_metrics | 10.6932 | ms |
| Median ML processing time | ucmapm_tx_metrics | 10.255 | ms |
| Max ML processing time | ucmapm_tx_metrics | 132 | ms |
| Min ML processing time | kibana-logs-ui-default-default-log-entry-categories-count | 3 | ms |
| Mean ML processing time | kibana-logs-ui-default-default-log-entry-categories-count | 42.5132 | ms |
| Median ML processing time | kibana-logs-ui-default-default-log-entry-categories-count | 26.3938 | ms |
| Max ML processing time | kibana-logs-ui-default-default-log-entry-categories-count | 426 | ms |
| Min ML processing time | kibana-metrics-ui-default-default-hosts_memory_usage | 0 | ms |
| Mean ML processing time | kibana-metrics-ui-default-default-hosts_memory_usage | 4.36879 | ms |
| Median ML processing time | kibana-metrics-ui-default-default-hosts_memory_usage | 3.11656 | ms |
| Max ML processing time | kibana-metrics-ui-default-default-hosts_memory_usage | 24 | ms |
| Min ML processing time | kibana-metrics-ui-default-default-hosts_network_in | 0 | ms |
| Mean ML processing time | kibana-metrics-ui-default-default-hosts_network_in | 1.71896 | ms |
| Median ML processing time | kibana-metrics-ui-default-default-hosts_network_in | 1 | ms |
| Max ML processing time | kibana-metrics-ui-default-default-hosts_network_in | 22 | ms |
| Min ML processing time | kibana-metrics-ui-default-default-hosts_network_out | 0 | ms |
| Mean ML processing time | kibana-metrics-ui-default-default-hosts_network_out | 1.27297 | ms |
| Median ML processing time | kibana-metrics-ui-default-default-hosts_network_out | 1 | ms |
| Max ML processing time | kibana-metrics-ui-default-default-hosts_network_out | 28 | ms |
| Min ML processing time | logs-11558ee526445db2b42eb3d6b4af58d0-log-entry-rate | 1 | ms |
| Mean ML processing time | logs-11558ee526445db2b42eb3d6b4af58d0-log-entry-rate | 5.7039 | ms |
| Median ML processing time | logs-11558ee526445db2b42eb3d6b4af58d0-log-entry-rate | 5 | ms |
| Max ML processing time | logs-11558ee526445db2b42eb3d6b4af58d0-log-entry-rate | 153 | ms |
| Min ML processing time | storage_estimate | 0 | ms |
| Mean ML processing time | storage_estimate | 1.31634 | ms |
| Median ML processing time | storage_estimate | 1 | ms |
| Max ML processing time | storage_estimate | 54 | ms |
| Min ML processing time | edw_agg.fact_home_usage_mly_defaultcounttrend | 0 | ms |
| Mean ML processing time | edw_agg.fact_home_usage_mly_defaultcounttrend | 1.53396 | ms |
| Median ML processing time | edw_agg.fact_home_usage_mly_defaultcounttrend | 0 | ms |
| Max ML processing time | edw_agg.fact_home_usage_mly_defaultcounttrend | 124 | ms |
| Total Young Gen GC time | | 148.835 | s |
| Total Young Gen GC count | | 998 | |
| Total Old Gen GC time | | 0 | s |
| Total Old Gen GC count | | 0 | |
| Store size | | 88982.7 | GB |
| Translog size | | 15.7914 | GB |
| Heap used for segments | | 0 | MB |
| Heap used for doc values | | 0 | MB |
| Heap used for terms | | 0 | MB |
| Heap used for norms | | 0 | MB |
| Heap used for points | | 0 | MB |
| Heap used for stored fields | | 0 | MB |
| Segment count | | 72747 | |
| Total Ingest Pipeline count | | 3.25678e+07 | |
| Total Ingest Pipeline time | | 2366.3 | s |
| Total Ingest Pipeline failed | | 0 | |
| Min Throughput | bulk | 1486.25 | docs/s |
| Mean Throughput | bulk | 124939 | docs/s |
| Median Throughput | bulk | 122471 | docs/s |
| Max Throughput | bulk | 166903 | docs/s |
| 50th percentile latency | bulk | 6732.8 | ms |
| 90th percentile latency | bulk | 29417.8 | ms |
| 99th percentile latency | bulk | 36512.5 | ms |
| 99.9th percentile latency | bulk | 40718 | ms |
| 100th percentile latency | bulk | 43921.3 | ms |
| 50th percentile service time | bulk | 6732.8 | ms |
| 90th percentile service time | bulk | 29417.8 | ms |
| 99th percentile service time | bulk | 36512.5 | ms |
| 99.9th percentile service time | bulk | 40718 | ms |
| 100th percentile service time | bulk | 43921.3 | ms |
| error rate | bulk | 50.05 | % |
[WARNING] Error rate is 50.05 for operation 'bulk'. Please check the logs.
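In case it helps to compare the two runs side by side, here is a minimal Python sketch that parses rows of the summary tables above into a dictionary. The helper name `parse_summary` is my own and not part of Rally; the sample rows are copied from the two tables, and only three `bulk` metrics are included to keep it short.

```python
def parse_summary(table_text):
    """Parse '| Metric | Task | Value | Unit |' rows into {(metric, task): value}."""
    results = {}
    for line in table_text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 4:
            continue
        metric, task, value, _unit = cells
        try:
            results[(metric, task)] = float(value)
        except ValueError:
            continue  # header and separator rows carry no numeric value
    return results


# Rows copied from the first run (80 clients).
first_run = """\
| Mean Throughput | bulk | 134547 | docs/s |
| 99th percentile service time | bulk | 22912.8 | ms |
| error rate | bulk | 0 | % |
"""

# Rows copied from the second run (200 clients, bulk size 20000).
second_run = """\
| Mean Throughput | bulk | 124939 | docs/s |
| 99th percentile service time | bulk | 36512.5 | ms |
| error rate | bulk | 50.05 | % |
"""

first = parse_summary(first_run)
second = parse_summary(second_run)
for key in first:
    print(key, first[key], second[key])
```

For these rows, the second run has lower mean throughput, a higher 99th percentile service time, and an error rate of 50.05% where the first run had 0%.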
Please let me know what is causing this failure.