Hello, I am running the bulk-update challenge:
esrally --pipeline=benchmark-only --track=eventdata --track-repository=eventdata --challenge=bulk-update --track-params=bulk_size:10000,bulk_indexing_clients:64 --target-hosts=10.145.14.241:9200 --client-options="timeout:240" --kill-running-processes
We are quite happy with the results; however, a full test run takes 7 days. The data is stored in Elasticsearch.
Questions:
1. With the data already populated in Elasticsearch, is it possible to reduce the execution time to 6 hours or 1 day and still observe the performance of the last day of the load? I know how to control the number of iterations in bulk-update.json.
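One way to cap the run length is to bound the schedule by time rather than iteration count. This is only a sketch: the operation name, client count, and surrounding structure here are placeholders, and the real bulk-update.json will differ, but Rally's track schema does support `time-period` and `warmup-time-period` (both in seconds) on a schedule entry:

```json
{
  "schedule": [
    {
      "operation": "bulk-update",
      "clients": 64,
      "warmup-time-period": 1800,
      "time-period": 21600
    }
  ]
}
```

With `time-period: 21600` the task stops after 6 hours regardless of how many bulk requests were issued. Whether a short run against the pre-populated index reproduces the day-7 performance profile depends on whether the load characteristics are driven by index size (which you already have) or by the progression of the workload itself.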
- From the execution it appears that in the first phase most operations are writes, but the majority of the load comes from reads. I tried to find how this works in the code, but it is not clear to me where the switch from write to read happens:
https://github.com/elastic/rally-eventdata-track/blob/master/eventdata/parameter_sources/elasticlogs_bulk_source.py
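One thing worth noting is that in Elasticsearch an update is internally a get (read) of the current document followed by a reindex of the merged result, so as the proportion of updates grows, read load grows with it even though the client only issues "writes". The sketch below is a hypothetical stand-in for a parameter source that mixes index and update actions; the function and parameter names (`bulk_params`, `update_probability`) are my assumptions for illustration, not the actual API of elasticlogs_bulk_source.py:

```python
import random

def bulk_params(num_docs, update_probability=0.5, seed=42):
    """Build a _bulk-style action list mixing index and update actions.

    Hypothetical sketch: once some documents exist, each new event is
    turned into an update of a previously indexed document with
    probability `update_probability`. On the Elasticsearch side each
    update is a get (read) plus a reindex, which is why the observed
    load shifts toward reads as the run progresses.
    """
    rng = random.Random(seed)
    known_ids = []
    lines = []
    for i in range(num_docs):
        if known_ids and rng.random() < update_probability:
            # Update an existing document by id -> read-heavy on ES.
            doc_id = rng.choice(known_ids)
            lines.append({"update": {"_id": doc_id}})
            lines.append({"doc": {"counter": i}})
        else:
            # Index a brand-new document -> pure write.
            doc_id = str(i)
            known_ids.append(doc_id)
            lines.append({"index": {"_id": doc_id}})
            lines.append({"message": f"event-{i}"})
    return lines
```

If the real parameter source follows a similar pattern, the "switch" is not a phase change in the code but a gradual shift: early on there are few known ids, so almost everything is an index action; later, updates (and their implicit gets) dominate.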
- Also, it looks like the Rally client's execution rate adapts based on response time. In my case the Rally machine has 64 CPUs with utilization of up to 10%, and throughput is around 130 MB/second. How can I accelerate throughput from the Rally client?
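If a single load generator turns out to be the bottleneck, Rally supports distributed load generation: start the `esrallyd` daemon on each load-driver machine and point the benchmark at them. A sketch, with placeholder IPs; please verify the exact flags against the Rally docs for your version:

```shell
# On the coordinator and on every additional load-driver machine:
esrallyd start --node-ip=10.145.14.10 --coordinator-ip=10.145.14.10

# Then run the race from the coordinator, listing all load drivers:
esrally --pipeline=benchmark-only --track=eventdata --track-repository=eventdata \
  --challenge=bulk-update \
  --track-params=bulk_size:10000,bulk_indexing_clients:64 \
  --target-hosts=10.145.14.241:9200 \
  --load-driver-hosts=10.145.14.10,10.145.14.11
```

That said, with 64 CPUs at only ~10% utilization, it is also worth experimenting with `bulk_indexing_clients` and `bulk_size` first, since the bottleneck may be the cluster's response time rather than the client machine.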