Rally consumes all cluster threads and crashes with small clusters

Internally, Rally uses the default Elasticsearch Python client to issue bulk requests. By default, Rally issues requests as fast as it can. In your case (geonames), Rally uses a bulk size of 5,000 documents per request and 8 clients. Rally cannot "know" what you want to measure and therefore has no backoff logic as a normal client would (sometimes this is handy; see e.g. the blog post "Why am I seeing bulk rejections in my Elasticsearch cluster?").
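For illustration, a plain run against an existing cluster applies exactly those defaults with no throttling (the host address here is a placeholder):

```
esrally --pipeline=benchmark-only --track=geonames --target-hosts=127.0.0.1:9200
```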

There is also a mode in Rally where you can define a target throughput and Rally will attempt to achieve it (whether it succeeds depends on whether Elasticsearch can sustain that throughput); see also the Rally FAQ. This mode is primarily meant for benchmarking operations where you are interested in latency at a specific throughput (e.g. searches), rather than batch operations (e.g. bulk indexing).
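As a minimal sketch, throttling is defined per task in a track's challenge schedule; I'm assuming here that the bulk operation is named "index-append" as in the geonames track, and target-throughput is given in operations per second:

```json
{
  "schedule": [
    {
      "operation": "index-append",
      "warmup-time-period": 120,
      "clients": 8,
      "target-throughput": 100
    }
  ]
}
```

With target-throughput set, Rally tries to issue at most 100 bulk requests per second across all clients; without it, the task runs unthrottled.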

In your case you are probably interested in finding the breaking point while avoiding bulk rejections. I suggest two things:

  • You can change the bulk size of the track to e.g. 500 documents with --track-params="bulk_size:500" (see the geonames track README and the combined example after this list). We do not yet expose the number of indexing clients as a parameter, although that would be possible.
  • Bulk rejections (and any other errors) are recorded by Rally, and if you use a dedicated metrics store you can inspect them in more detail. However, in your case I have the impression that you want to treat a bulk rejection as a fatal error, so you could add the parameter --on-error=abort; Rally will then treat any HTTP error as fatal and abort the benchmark immediately (also shown in the example below).
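Putting both suggestions together, an invocation could look like this (pipeline and host address depend on your setup):

```
esrally --pipeline=benchmark-only --track=geonames --track-params="bulk_size:500" --on-error=abort --target-hosts=127.0.0.1:9200
```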