Can you give detailed information about documents.json?

Neslihan · January 16, 2019, 11:33am

What is the purpose of the documents.json file when creating a custom track?

What is the difference between running the benchmark with a 1 GB documents.json and with a 10 GB documents.json?

danielmitterdorfer · January 17, 2019, 6:57am

Hi,

Please see the Rally docs for a full tutorial on how to create custom tracks. The file you're mentioning contains the documents that you want to bulk-index into Elasticsearch. It is the document corpus you're operating on in this benchmark. Different index sizes cause all sorts of different system behavior: with larger indices, querying might take longer, page cache behavior might be different (causing less or more load on I/O), you will see different background merge activity in Elasticsearch, and so on.
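As a rough illustration of the file layout: the corpus file holds one JSON document per line. The sketch below writes a few made-up documents in that layout and computes the two values that a track's corpus entry records alongside `source-file`, namely `document-count` and `uncompressed-bytes`. The helper name `write_corpus` and the sample fields are hypothetical and only serve the example; check the Rally docs for the authoritative track format.

```python
import json
import os
import tempfile


def write_corpus(path, docs):
    """Write docs as newline-delimited JSON; return (document_count, size_in_bytes)."""
    count = 0
    # newline="\n" keeps line endings identical on every platform
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for doc in docs:
            f.write(json.dumps(doc) + "\n")  # one document per line
            count += 1
    return count, os.path.getsize(path)


# Hypothetical sample documents; a real corpus would come from your own data.
sample_docs = [{"id": i, "message": f"event {i}"} for i in range(3)]

path = os.path.join(tempfile.mkdtemp(), "documents.json")
doc_count, size_bytes = write_corpus(path, sample_docs)

# Values that would go into the corpus definition of a custom track (assumed layout).
corpus_entry = {
    "source-file": "documents.json",
    "document-count": doc_count,
    "uncompressed-bytes": size_bytes,
}
print(json.dumps(corpus_entry, indent=2))
```

A 1 GB and a 10 GB corpus use exactly the same layout; only the number of lines, and therefore the resulting index size, differs.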

For a general introduction to what to take care of when benchmarking (also from a methodological perspective), I recommend reading the blog post Seven Tips for Better Elasticsearch Benchmarks.

Daniel

