[ERROR] Cannot race. Worker [22] has exited prematurely

Hi,
I am getting the following error: [ERROR] Cannot race. Worker [22] has exited prematurely.
My setup and configuration are as follows:

  • Elasticsearch version 8.5.2
  • Rally (esrally) version 2.7.0
  • 5-node Kubernetes cluster: 1 controller plus 4 worker nodes (1 Rally benchmark pod, 1 Elasticsearch master pod, 2 Elasticsearch data pods)
  • Number of vCPUs: Rally node 128, Elasticsearch master 8, Elasticsearch data nodes 8
  • Number of shards = 4
  • Bulk size = 10,000
  • Bulk indexing clients = 128
  • Translog flush = 4g
  • Codec = LZ4
  • Max merge count = 4

I see this error when the number of bulk indexing clients is 128, but not with lower values such as 8, 16, or 32.

Could someone please help me with this?

nyc_taxis_params.json  nyc_taxis https://es-master:9200  basic_auth_user:rally,basic_auth_password:changeme,timeout:240,use_rue,verify_certs:false,ca_certs:/rally/cacert.pem
4 0 100 20000 128 -1 append-no-conflicts-index-only

    ____        ____
   / __ \____ _/ / /_  __
  / /_/ / __ `/ / / / / /
 / _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
                /____/

[INFO] Decompressing track data from [/rally/.rally/benchmarks/data/nyc_taxis/documents.json.bz2] to [/rally/.rally/benchmarta/nyc_taxis/documents.json] (resulting size: [74.32] GB) ... [OK]
[INFO] Preparing file offset table for [/rally/.rally/benchmarks/data/nyc_taxis/documents.json] ... [OK]
[INFO] Race id is [6fe6cad6-1145-4d58-a307-07dd3e3adbfc]
[WARNING] Could not terminate all internal processes within timeout. Please check and force-terminate all Rally processes.
[ERROR] Cannot race. Worker [10] has exited prematurely.

Getting further help:
*********************
* Check the log files in /rally/.rally/logs for errors.
* Read the documentation at https://esrally.readthedocs.io/en/2.7.0/.
* Ask a question on the forum at https://discuss.elastic.co/tags/c/elastic-stack/elasticsearch/rally.
* Raise an issue at https://github.com/elastic/rally/issues and include the log files in /rally/.rally/logs.

---------------------------------
[INFO] FAILURE (took 215 seconds)
---------------------------------

@Utkarsh17 Hello! Thank you for using Rally.

It looks like there may be processes left over from another Rally run on your system. Can you try executing Rally with --kill-running-processes? The --test-mode flag can also be used for fast iteration and troubleshooting.
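
As a rough illustration of the suggestion above, here is a small Python sketch that assembles an esrally command line with both flags appended. The track, challenge, params file, and target host are taken from the original post. The helper name `build_race_command` and the `benchmark-only` pipeline are assumptions, so adjust them to match how you actually launch Rally.

```python
import shlex


def build_race_command(target_hosts="https://es-master:9200",
                       kill_running=True, test_mode=True):
    """Assemble an esrally invocation as an argument list (hypothetical helper)."""
    cmd = [
        "esrally", "race",
        "--track=nyc_taxis",
        "--challenge=append-no-conflicts-index-only",
        "--track-params=nyc_taxis_params.json",
        # Host value as given in the original post.
        f"--target-hosts={target_hosts}",
        # Assumption: the cluster is managed outside Rally.
        "--pipeline=benchmark-only",
    ]
    # Add your usual --client-options (auth, TLS) here, as in your own run.
    if kill_running:
        # Terminate leftover Rally processes before starting.
        cmd.append("--kill-running-processes")
    if test_mode:
        # Run a shortened benchmark for quick troubleshooting.
        cmd.append("--test-mode")
    return cmd


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it.
    print(shlex.join(build_race_command()))
```

Printing the command rather than executing it keeps the sketch safe to try. Once a test-mode run completes cleanly, drop `--test-mode` for the real benchmark.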

If you have further questions, could you please include the full rally.log?
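
While gathering the log, a quick scan can help spot the lines worth quoting in your reply. The Python sketch below is only an illustration: the marker strings and the function names are assumptions and not part of Rally, and the log path combines the directory shown in Rally's output with the `rally.log` file name. It does not replace attaching the full log.

```python
from pathlib import Path

# Assumed substrings that usually indicate a problem; extend as needed.
MARKERS = ("ERROR", "Traceback", "exited prematurely")


def find_suspect_lines(lines, markers=MARKERS):
    """Return (line_number, text) pairs for lines containing any marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(marker in line for marker in markers):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log(path):
    """Scan a log file on disk; return an empty list if the file is missing."""
    log_file = Path(path)
    if not log_file.is_file():
        return []
    with log_file.open(encoding="utf-8", errors="replace") as handle:
        return find_suspect_lines(handle)


if __name__ == "__main__":
    # Assumption: the log lives in the directory named in Rally's output.
    for number, text in scan_log("/rally/.rally/logs/rally.log"):
        print(f"{number}: {text}")
```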