How to re-run failed batch updates

Hello, I am running a heavy load against Elasticsearch using the bulk-update challenge.

esrally --pipeline=benchmark-only --track=eventdata --track-repository=eventdata --challenge=bulk-update --track-params=bulk_size:30000,bulk_indexing_clients:64 --target-hosts=<my_ip>:9200 --client-options="timeout:640" --kill-running-processes

I have a few bulk rejections.

2020-12-27 17:56:29,368 -not-actor-/PID:57141 elasticsearch WARNING POST http://my-ip:9200/_bulk [status:429 request:0.271s]
2020-12-27 17:59:18,580 -not-actor-/PID:57138 elasticsearch WARNING POST http://my-ip:9200/_bulk [status:429 request:2.373s]
2020-12-27 17:59:20,443 -not-actor-/PID:57138 elasticsearch WARNING POST http://my-ip:9200/_bulk [status:429 request:1.222s]
2020-12-27 18:04:20,291 -not-actor-/PID:57151 elasticsearch WARNING POST http://my-ip:9200/_bulk [status:429 request:0.511s]
2020-12-27 18:04:21,338 -not-actor-/PID:57151 elasticsearch WARNING POST http://my-ip:9200/_bulk [status:429 request:0.399s]
2020-12-27 18:04:46,494 -not-actor-/PID:57127 elasticsearch WARNING POST http://my-ip:9200/_bulk [status:429 request:1.330s]
2020-12-27 18:04:48,158 -not-actor-/PID:57127 elasticsearch WARNING POST http://my-ip:9200/_bulk [status:429 request:1.017s]
2020-12-27 18:15:43,399 -not-actor-/PID:57114 elasticsearch WARNING POST http://my-ip:9200/_bulk [status:429 request:0.332s]
2020-12-27 18:15:44,376 -not-actor-/PID:57114 elasticsearch WARNING POST http://my-ip:9200/_bulk [status:429 request:0.312s]
2020-12-27 18:15:45,331 -not-actor-/PID:57114 elasticsearch WARNING POST http://my-ip:9200/_bulk [status:429 request:0.311s]
2020-12-27 18:23:11,123 -not-actor-/PID:57130 elasticsearch WARNING POST http://my-ip:9200/_bulk [status:429 request:0.687s]
2020-12-27 18:23:12,186 -not-actor-/PID:57130 elasticsearch WARNING POST http://my-ip:9200/_bulk [status:429 request:0.405s]

Questions:

  1. Is there a way to re-run failed bulk updates, i.e. configure the job so that rejected bulk operations are retried?
  2. I saw in the code that sequential IDs are generated for this use case. Will rejected bulks (which leave gaps in the ID sequence) affect the results under heavy load once the index is large enough? In other words, because not all IDs are present in sequence, will the load end up lower than it should be?

Hi,

This is intentionally not supported out of the box. The purpose of a benchmark is to evaluate the steady-state performance of a cluster, and if the benchmark overwhelms the cluster you are in unstable territory; retries would only skew the picture even more. That said, it is possible to implement a so-called custom runner for your own tracks and add retries there.
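For reference, a custom runner in Rally is an async callable taking the client and the operation parameters, registered from a track's track.py. Below is a minimal sketch of what a retry-on-429 runner could look like; the runner name `bulk-with-retry`, the function name, and the retry parameters are illustrative, not part of any existing track, and the `status_code` attribute is assumed to be exposed by the client's transport error (as elasticsearch-py's TransportError does).

```python
import asyncio


async def bulk_with_retry(es, params, max_retries=3, backoff_s=0.5):
    """Illustrative Rally-style custom runner: retries a bulk request
    when the client raises an error with HTTP status 429."""
    for attempt in range(max_retries + 1):
        try:
            await es.bulk(body=params["body"])
            # Rally runners return metadata about the completed operation.
            return {"weight": params.get("bulk-size", 1), "unit": "docs", "success": True}
        except Exception as e:
            # Assumption: the client exception carries the HTTP status
            # (elasticsearch-py's TransportError has `status_code`).
            if getattr(e, "status_code", None) == 429 and attempt < max_retries:
                # Exponential backoff before retrying the rejected bulk.
                await asyncio.sleep(backoff_s * (2 ** attempt))
            else:
                raise


def register(registry):
    # Hook that Rally invokes when loading a custom track plugin.
    registry.register_runner("bulk-with-retry", bulk_with_retry, async_runner=True)
```

Keep in mind that, as noted above, retrying rejected bulks changes what you are measuring, so results from such a run are no longer directly comparable to a plain bulk-update run.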

I think it makes more sense in that case to reduce the load or increase the cluster capacity to bring the benchmark back to a steady state.

Hope that helps.

Daniel
