Proper shard and replica settings

HyebinHong · July 3, 2024, 8:07am

Hello, Elastic!

I'm planning to change my shard and replica settings because the current ones are not reasonable.

I have 5 types of indices, sized 30MB, 2GB, 25GB, 35GB, and 200GB, and each type has 2 sets (a live index and an indexing index).
Currently, every index is set to 2 shards and 5 replicas.

As I learned here, each shard should be around 20-50GB.
So I'm planning:

30MB, 2GB index: 1 shard
25GB, 35GB index: 2 shards
200GB index: 5 shards
Is this plan appropriate?
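
A minimal sketch of the arithmetic behind this plan, assuming data spreads evenly across primary shards and taking the 20-50GB target exactly as quoted in the post. The names `PLAN`, `per_shard_gb`, and `in_target` are made up for illustration, and 30MB is approximated as 0.03GB.

```python
# Planned primary shard counts from the post; sizes in GB (30MB ~ 0.03GB).
PLAN = [
    ("30MB", 0.03, 1),
    ("2GB", 2.0, 1),
    ("25GB", 25.0, 2),
    ("35GB", 35.0, 2),
    ("200GB", 200.0, 5),
]

# Target range as quoted in the post (assumption: taken at face value).
TARGET_MIN_GB = 20.0
TARGET_MAX_GB = 50.0


def per_shard_gb(index_size_gb: float, primaries: int) -> float:
    """Average primary shard size, assuming an even spread across shards."""
    return index_size_gb / primaries


def in_target(shard_gb: float) -> bool:
    """True when the shard size falls inside the quoted target range."""
    return TARGET_MIN_GB <= shard_gb <= TARGET_MAX_GB


for label, size_gb, primaries in PLAN:
    shard_gb = per_shard_gb(size_gb, primaries)
    status = "within target" if in_target(shard_gb) else "outside target"
    print(f"{label}: {primaries} shard(s) of ~{shard_gb:g}GB each ({status})")
```

Under these assumptions only the 200GB index (5 shards of about 40GB) lands inside the quoted range. The 25GB and 35GB indices would each fall inside it as a single shard, and the two smallest indices cannot reach it with any shard count.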

I heard that the number of replicas affects search performance, so I'm wondering what number of replicas is reasonable when service traffic is over 2,000 per second in general and over 10,000 per second during specific periods?

Thank you in advance.

Kathleen_DeRusso · July 3, 2024, 12:10pm

The shard and replica count really depends on a lot of factors - including document size, the number and types of queries you'll be running, mappings, etc. Our best, recommended way to know the ideal sharding strategy for your use case is to benchmark your data with the types of indexing and search loads you expect.

You may find our sizing guide helpful as well.

HyebinHong · July 8, 2024, 1:52am

I've read the page you pointed me to several times; however, it is still difficult to decide.

All that aside:
If I have 20 nodes and an index with 30GB of primary data, which option would give better search or indexing performance?
Or at least, which would you choose if you were me?

2 shards (15GB each), 5 replicas: 2 * (5+1) = 12 shards total
4 shards (7.5GB each), 2 replicas: 4 * (2+1) = 12 shards total
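
The two options above can be compared with a short sketch. It assumes 30GB of primary data split evenly across primaries, and that each replica is a full copy of its primary. The function names and the `OPTIONS` list are hypothetical helpers, not part of any Elasticsearch API.

```python
NODES = 20               # cluster size from the post
PRIMARY_DATA_GB = 30.0   # primary data size from the post


def total_shard_copies(primaries: int, replicas: int) -> int:
    """Primaries plus all replica copies: primaries * (replicas + 1)."""
    return primaries * (replicas + 1)


def shard_size_gb(primary_data_gb: float, primaries: int) -> float:
    """Average size of one shard copy, assuming an even split."""
    return primary_data_gb / primaries


def total_disk_gb(primary_data_gb: float, replicas: int) -> float:
    """Disk used by the index across the cluster, replicas included."""
    return primary_data_gb * (replicas + 1)


# (primaries, replicas) for the two options in the post
OPTIONS = [(2, 5), (4, 2)]

for primaries, replicas in OPTIONS:
    copies = total_shard_copies(primaries, replicas)
    print(
        f"{primaries} primaries x {replicas} replicas: "
        f"{copies} shard copies of ~{shard_size_gb(PRIMARY_DATA_GB, primaries):g}GB, "
        f"~{total_disk_gb(PRIMARY_DATA_GB, replicas):g}GB on disk, "
        f"at most {min(copies, NODES)} of {NODES} nodes holding a copy"
    )
```

Both options give 12 shard copies, so at most 12 of the 20 nodes hold data for this index either way. They differ in shard size (15GB vs 7.5GB) and in total disk use (about 180GB vs 90GB).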

My question is, what is the main factor for better performance: shards or replicas?
Of course, I know there are many factors, but I still have to choose a balance between them.

dadoonet · July 8, 2024, 3:47am

From Elastic Search to Elasticsearch

DavidTurner · July 8, 2024, 5:10am

The linked page includes a section entitled "Aim for shards of up to 200M documents, or with sizes between 10GB and 50GB" which I think answers your question. With 4 shards each shard would be less than 10GiB which is smaller than recommended. 2x15GiB shards would be within the recommended range, but so would 1x30GiB shard.

Almost certainly not this. Or at least, some workloads will get better results with 1x30GiB primary shard and others might do better with 2x15GiB primary shards. There's no way to be sure without benchmarking both setups using your specific workload and data.
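
As a rough sketch of what "benchmarking both setups" could start from, the snippet below builds the index-creation request bodies for the 1x30GiB and 2x15GiB candidates. `number_of_shards` and `number_of_replicas` are the standard Elasticsearch index settings; the index names and the replica count of 5 (carried over from the current setup described earlier in the thread) are assumptions for illustration only.

```python
import json


def build_index_body(primaries: int, replicas: int) -> dict:
    """Request body for creating an index with the given shard layout."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primaries,    # fixed at index creation
                "number_of_replicas": replicas,   # can be changed later
            }
        }
    }


# Hypothetical index names for the two candidates to benchmark.
CANDIDATES = {
    "bench-1x30": build_index_body(primaries=1, replicas=5),
    "bench-2x15": build_index_body(primaries=2, replicas=5),
}

for name, body in CANDIDATES.items():
    # Each body would be sent as the payload of a create-index request (PUT /<name>).
    print(f"PUT /{name}")
    print(json.dumps(body, indent=2))
```

Holding the replica count constant across both candidates keeps the comparison focused on the primary shard count; the replica count can then be varied in a separate run.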

HyebinHong · July 8, 2024, 5:19am

Yes, definitely. My options did not properly represent the point.
Anyway, since you all said it depends on various factors, I will check whether I can set up and run Rally against our running service.

Thank you all!

system · August 5, 2024, 5:19am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
