I'm running esrally locally against a single node. My esrally version is 2.30. When I run the command below, execution hangs while checking the status of the cluster, i.e. in check-cluster-health.
Checking the health gives a green result:
$ curl http://localhost:39200/_cat/health?v
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1638996487 20:48:07 rally-benchmark green 1 1 2 2 0 0 0 0 - 100.0%
But, what I see in the rally.log is:
2021-12-08 20:47:15,330 -not-actor-/PID:92289 Elasticsearch WARNING GET http://127.0.0.1:39200/_cluster/health/nyc_taxis?wait_for_status=green&wait_for_no_relocating_shards=false [status:408 request:30.003s]
The answers to other, similar questions did not fix the issue for me.
Also, make sure there are no leftover processes from previous esrally executions.
Can you share more of rally.log, preferably the complete file and any customized configurations?
GET /_cluster/health/<index> can behave this way for non-existing indices or if the index will never be green. _cat/health returns green because it represents the status of all known indices in the cluster.
Some possibilities:
The nyc_taxis index exists but is yellow. This will happen if the number of index replicas has been configured to be >0 on a single-node cluster, because a replica shard cannot be allocated on the same node as its primary.
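To tell these cases apart, a rough sketch along the following lines may help. The function `classify_index_health` is a hypothetical helper, not part of Rally or Elasticsearch; it only interprets two values you collect yourself with curl. The host and port are the ones from this thread, and the short `timeout=5s` is an assumption made so the health call does not block for the default 30 seconds.

```shell
#!/bin/sh
# Hypothetical helper (not part of Rally or Elasticsearch).
#   $1: HTTP status code returned by GET /<index>
#   $2: JSON body returned by GET /_cluster/health/<index>
classify_index_health() {
  exists_code=$1
  health_json=$2
  # Pull out the "status" field with sed so that jq is not required.
  status=$(printf '%s' "$health_json" | sed -n 's/.*"status":"\([a-z]*\)".*/\1/p')
  if [ "$exists_code" = "404" ]; then
    echo "index missing: health wait can never reach green"
  elif [ "$status" = "yellow" ]; then
    echo "index yellow: check number_of_replicas on a single node"
  elif [ "$status" = "green" ]; then
    echo "index green"
  else
    echo "index status: ${status:-unknown}"
  fi
}

# Possible usage against a running cluster (assumed host/port from this thread):
#   code=$(curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:39200/nyc_taxis)
#   body=$(curl -s 'http://127.0.0.1:39200/_cluster/health/nyc_taxis?timeout=5s')
#   classify_index_health "$code" "$body"
```

If the first case is reported, the index was never created (or was deleted), which would explain a 408 on the per-index health call even while _cat/health shows green.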
$ curl -X GET http://127.0.0.1:39200/_cluster/health/nyc_taxis
{"cluster_name":"rally-benchmark","status":"red","timed_out":true,"number_of_nodes":1,"number_of_data_nodes":1,"active_primary_shards":0,"active_shards":0,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":100.0}
The first time I run esrally, there is no issue. But when I run it again (after a clean completion of the first run), the issue appears. Currently, I'm using the default configuration and not changing anything. That is, to reproduce the problem, all I have to do is run it twice.
Since running the whole nyc_taxis challenge is slow on my laptop, I added --test-mode to your commands, and I fail to reproduce the issue. Do you also have the issue when you add --test-mode?
Also, what do you call "the first run"? What do you need to reset to get into "first run" conditions?
Is there a reason why you could not share your Rally configuration and logs, as requested by Jason?