Hi all, I want to use Rally to benchmark the performance of deleting documents rather than deleting a whole index, for example deleting all documents from a specific day.
I know Elasticsearch supports this through the delete_by_query API, but I couldn't find out whether Rally supports this kind of operation.
Any clues or links would be much appreciated! Thanks in advance.
Hello @gloriacs, Rally focuses on operations representing Elasticsearch APIs for indexing, search, and other related APIs. Out of the box, Rally does not define an operation for deleting individual documents. To do this, create a custom runner from the documented example to define your operation.
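For reference, a custom runner along these lines might look like the sketch below. It assumes a recent Rally version with async runners and a `track.py` placed next to the track definition; the operation name `custom-delete-by-query` and the `index` and `body` parameters are made up for illustration, and the `body=` keyword may need adjusting to whichever Elasticsearch client your Rally version ships with.

```python
# track.py -- a minimal sketch, not an official Rally example.
# Assumption: the operation in the track passes "index" and "body" params.

async def delete_by_query(es, params):
    # es is the async Elasticsearch client that Rally hands to the runner
    response = await es.delete_by_query(
        index=params["index"],
        body=params["body"],
    )
    deleted = response.get("deleted", 0)
    # Report the number of deleted documents as the weight of this operation
    return {"weight": deleted, "unit": "docs", "deleted": deleted}


def register(registry):
    # "custom-delete-by-query" is a hypothetical operation-type name
    registry.register_runner(
        "custom-delete-by-query", delete_by_query, async_runner=True
    )
```

An operation in the track would then use `"operation-type": "custom-delete-by-query"` together with the `index` and `body` parameters.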
Deleting a lot of documents using delete-by-query is a single long-running operation. If I were trying to quantify the impact, I would use Rally to generate the ongoing indexing and query load, then trigger a delete-by-query at a specific point in time and see how the indexing and query operations executed by Rally are affected. I would not try to get Rally to run the delete-by-query itself, as you are not necessarily interested in how long it takes. Make sure you use a metrics store so you can plot the performance data and spot changes in behaviour over time.
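Triggering the delete from outside Rally could be done with a small script like the one below. This is only a sketch: the `@timestamp` field name, the base URL, and the index name are assumptions you would replace with your own, and it uses `wait_for_completion=false` so the request returns a task ID instead of blocking until the delete finishes.

```python
import json
import urllib.request
from datetime import date, timedelta


def day_delete_body(day, field="@timestamp"):
    # Match documents whose timestamp falls within the given calendar day.
    # The field name is an assumption; use whatever your mapping defines.
    next_day = day + timedelta(days=1)
    return {
        "query": {
            "range": {
                field: {"gte": day.isoformat(), "lt": next_day.isoformat()}
            }
        }
    }


def trigger_delete(base_url, index, day):
    # Start the delete-by-query as a background task and return the response,
    # which should contain the task ID for later inspection.
    url = f"{base_url}/{index}/_delete_by_query?wait_for_completion=false"
    request = urllib.request.Request(
        url,
        data=json.dumps(day_delete_body(day)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

You would call something like `trigger_delete("http://localhost:9200", "my-index", date(2020, 1, 31))` while the Rally benchmark is running, note the time, and then compare the metrics before and after that point.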
Thanks @json and @Christian_Dahlqvist. I will create a custom runner and quantify the impact. Both solutions are good, and thanks for the quick response.