Benchmarking ES with Rally: store size metric increases, then decreases as shard count grows

I was benchmarking an ES index with different shard counts using esrally, to find the shard count with the best throughput for that specific index.
As I read in the documentation, store size is the index size (translog excluded).
In the report summaries of the tests I performed, I expected store size to increase as the shard count increased. Instead, it increased at first and then suddenly decreased at some point. What is the reason for this behavior?
The track includes the following operations:

Running delete-index [100% done]
Running create-index [100% done]
Running cluster-health [100% done]
Running bulk [100% done]

Race report summary with 1 shard:
Store size 0.0001169 GB

Race report summary with 2 shards:
Store size 5.12749 GB

Race report summary with 3 shards:
Store size 2.22465 GB

Race report summary with 4 shards:
Store size 1.93715 GB

Hi, welcome to the Elastic community, and thank you for your post!

The race report values vary fairly broadly! The described behavior sounds like the normal ES segment merge cycle. Can you share the full race summary reports? I suspect merge throttling is causing it.

Are you using a publicly available Rally track? What are the CPU, memory and storage specs of your target system? I could try reproducing it for a better perspective.
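One way to check whether merges are still settling when the store size is sampled is to read the primary store size and segment count from the index stats API right after the bulk step, and again a little later. The following is only a minimal sketch: the base URL, the index name, and the helper names `summarize_stats` and `fetch_stats` are assumptions for illustration, and the bytes-to-GB conversion assumes binary gigabytes, which may not match Rally's own rounding.

```python
import json
import urllib.request

# Assumption: binary gigabytes (1024^3 bytes); adjust if your report uses 10^9.
BYTES_PER_GB = 1024 ** 3


def summarize_stats(stats):
    """Extract primary store size (GB) and segment count from an
    index stats response (GET /<index>/_stats/store,segments)."""
    primaries = stats["_all"]["primaries"]
    return {
        "store_gb": primaries["store"]["size_in_bytes"] / BYTES_PER_GB,
        "segment_count": primaries["segments"]["count"],
    }


def fetch_stats(base_url, index):
    """Fetch store and segment stats for one index.
    base_url and index are placeholders, e.g. "http://localhost:9200" and "my-index"."""
    url = f"{base_url}/{index}/_stats/store,segments"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


# Hypothetical usage against a local, unsecured test cluster:
#   print(summarize_stats(fetch_stats("http://localhost:9200", "my-index")))
```

If the segment count keeps dropping between two samples while the store size changes, the index was probably still merging when the first value was taken.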

Jason

Thanks, yes, the full report is attached.

shards = 1 vs shards = 2

shards = 2 vs shards = 3

shards = 3 vs shards = 4

I created the track from my own index, the one I want to find the optimal shard count for.
Actually, I don't understand how segment merges can cause this fluctuation. It would be very helpful to know more about these merges and how much storage they may take. Is there any specific documentation I should read to learn more about the segment merge process? Thanks
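As a rough illustration of why on-disk size can move in both directions: Lucene segments are immutable, so a merge writes one new segment from several existing ones and only then deletes the originals. The toy model below is not Elasticsearch code. The `merge` helper and the `size_ratio` value are hypothetical; the real ratio depends on deleted documents and on how well the merged data compresses.

```python
def merge(segments, picked, size_ratio=0.9):
    """Toy model of one segment merge.

    segments:   list of segment sizes in MB
    picked:     set of indices of the segments being merged
    size_ratio: assumed size of the merged segment relative to its inputs
                (hypothetical value, chosen only for illustration)

    Returns (segments_after, peak_mb), where peak_mb is the disk usage while
    the old segments and the new merged segment exist side by side.
    """
    before = sum(segments)
    merged_input = sum(segments[i] for i in picked)
    new_segment = merged_input * size_ratio
    peak_mb = before + new_segment
    remaining = [s for i, s in enumerate(segments) if i not in picked]
    return remaining + [new_segment], peak_mb


segments = [100.0, 100.0, 100.0, 100.0]
after, peak = merge(segments, picked={0, 1, 2, 3})
print("before:", sum(segments), "MB")
print("peak during merge:", peak, "MB")
print("after merge:", sum(after), "MB")
```

In this model the same data can show up as three different store sizes depending on whether the measurement lands before, during, or after a merge, which is one possible explanation for runs that do not grow monotonically with shard count.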
