Hi all,
I'm studying the Elasticsearch Adhoc Benchmarks and am curious about the disk usage metrics. Why is there such a huge gap between the final index size (25 GB) and the total bytes written (313 GB)?
Lucene stores data in immutable segments. Segments are initially flushed to disk fairly small and are then repeatedly merged into larger and larger ones as more data is added. Because merging rewrites existing data into a new segment, the same data ends up being written to disk multiple times over the life of the index. This write amplification is why total bytes written is many times larger than the final index size.
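To see why merging multiplies the bytes written, here is a minimal sketch of a simplified tiered-merge model (an assumption for illustration only, not Lucene's actual TieredMergePolicy): every 10 same-sized segments are merged into one segment 10x the size, and each merge rewrites all of its input data.

```python
MERGE_FACTOR = 10      # merge 10 segments of a tier into one (assumed factor)
NUM_FLUSHES = 100_000  # each flush writes one unit-sized segment

def simulate(num_flushes, factor=MERGE_FACTOR):
    tiers = {}          # tier level -> count of segments at that level
    total_written = 0
    for _ in range(num_flushes):
        tiers[0] = tiers.get(0, 0) + 1
        total_written += 1                          # initial flush to disk
        level = 0
        while tiers.get(level, 0) >= factor:        # cascade merges upward
            tiers[level] -= factor
            tiers[level + 1] = tiers.get(level + 1, 0) + 1
            total_written += factor ** (level + 1)  # merge rewrites all input data
            level += 1
    final_size = num_flushes    # no deletes in this model, so live data = flushed data
    return total_written, final_size

written, final = simulate(NUM_FLUSHES)
print(f"final size: {final}, total written: {written}, "
      f"amplification: {written / final:.1f}x")
# → final size: 100000, total written: 600000, amplification: 6.0x
```

Even in this toy model each unit of data is rewritten once per merge tier it passes through, so total bytes written grows by a multiple of the final index size. The real ratio in the benchmark (313 GB written for a 25 GB index, roughly 12.5x) also reflects factors this sketch ignores, such as translog writes, deleted-document reclamation, and Lucene's actual merge policy.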