Proper shard and replica settings

Hello, Elastic!

I'm planning to change my shard and replica settings because the current ones are not reasonable.

I have 5 types of indices, sized 30MB, 2GB, 25GB, 35GB, and 200GB, and each type has 2 sets (a live index and an indexing index).
Currently, every index is set to 2 shards and 5 replicas.

As I learned here, each shard should be around 20-50GB.
So I'm planning:

  • 30MB, 2GB index: 1 shard
  • 25GB, 35GB index: 2 shards
  • 200GB index: 5 shards

Is this a reasonable plan?
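
As a quick sanity check, the plan above can be compared against the 20-50GB guideline with a few lines of arithmetic. This is only a sketch: the helper names (`shard_size_gb`, `check_plan`) are made up for illustration, the sizes are the approximate figures from this post, and the range is the one quoted above rather than a hard rule.

```python
# Planned layout from the post: index label -> (approximate size in GB, primary shards).
PLAN = {
    "30MB": (0.03, 1),
    "2GB": (2.0, 1),
    "25GB": (25.0, 2),
    "35GB": (35.0, 2),
    "200GB": (200.0, 5),
}


def shard_size_gb(index_size_gb, primaries):
    """Average size of one primary shard, assuming data is spread evenly."""
    return index_size_gb / primaries


def check_plan(plan, low_gb=20.0, high_gb=50.0):
    """Compare each planned shard size with the guideline range (assumed 20-50GB)."""
    report = {}
    for name, (size_gb, primaries) in plan.items():
        per_shard = shard_size_gb(size_gb, primaries)
        if per_shard > high_gb:
            verdict = "above range"
        elif per_shard < low_gb:
            verdict = "below range"
        else:
            verdict = "within range"
        report[name] = (per_shard, verdict)
    return report


if __name__ == "__main__":
    for name, (per_shard, verdict) in check_plan(PLAN).items():
        print(f"{name}: {per_shard:g}GB per shard, {verdict}")
```

Under these assumptions only the 200GB index (40GB per shard) lands inside the range; the 25GB and 35GB indices come out at 12.5GB and 17.5GB per shard with 2 primaries, so a single primary would sit closer to the guideline for them.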

I heard that the number of replicas affects search performance, so I'm wondering what number of replicas is reasonable when the service traffic is generally over 2,000 per second, and over 10,000 per second during specific periods.

Thank you in advance.

The shard and replica count really depends on a lot of factors, including document size, the number and types of queries you'll be running, mappings, etc. The best way to find the ideal sharding strategy for your use case, and the one we recommend, is to benchmark your data with the types of indexing and search loads you expect.

You may find our sizing guide helpful as well.

I've read the page you linked several times; however, it is still difficult to decide.

All that aside, if I have 20 nodes and an index with 30GB of primary data, which of these would give better search or indexing performance?
Or at least, which would you choose if you were me?

  • 2 shards (15GB each), 5 replicas = 2 * (5+1) = 12 shards total
  • 4 shards (7.5GB each), 2 replicas = 4 * (2+1) = 12 shards total
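
The arithmetic behind the two options can be written out as a small sketch. The `Layout` class is hypothetical and only restates the formula used above, total shards = primaries * (replicas + 1); it also shows the total data stored, since every replica is a full copy of its primary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Hypothetical description of one index layout: primaries and replicas per primary."""
    primaries: int
    replicas: int

    def total_shards(self):
        # Each primary has `replicas` copies, plus the primary itself.
        return self.primaries * (self.replicas + 1)

    def shard_size_gb(self, primary_data_gb):
        # Average shard size, assuming documents are spread evenly.
        return primary_data_gb / self.primaries

    def total_storage_gb(self, primary_data_gb):
        # Every replica stores a full copy of the primary data.
        return primary_data_gb * (self.replicas + 1)


if __name__ == "__main__":
    primary_data_gb = 30.0  # the 30GB index from the question
    for layout in (Layout(primaries=2, replicas=5), Layout(primaries=4, replicas=2)):
        print(
            layout,
            "total shards:", layout.total_shards(),
            "GB per shard:", layout.shard_size_gb(primary_data_gb),
            "GB stored:", layout.total_storage_gb(primary_data_gb),
        )
```

Both layouts give 12 shards in total, but they are not equivalent: the first keeps 6 copies of the data (180GB stored), the second keeps 3 copies (90GB stored) in smaller shards.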

My question is: what is the main factor for better performance, the number of shards or the number of replicas?
Of course, I know that there are many factors, but I still have to choose a balance. :smiling_face_with_tear:


The linked page includes a section entitled "Aim for shards of up to 200M documents, or with sizes between 10GB and 50GB" which I think answers your question. With 4 shards each shard would be less than 10GiB which is smaller than recommended. 2x15GiB shards would be within the recommended range, but so would 1x30GiB shard.
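
To make that concrete, here is a rough sketch that lists which primary shard counts keep the average shard size inside the 10GB-50GB range quoted above. The function name and the `max_primaries` cap are illustrative assumptions, and the check ignores the 200M-document part of the guideline as well as the GB/GiB distinction.

```python
def candidate_primary_counts(index_size_gb, low_gb=10.0, high_gb=50.0, max_primaries=10):
    """Return primary shard counts whose average shard size falls within [low_gb, high_gb]."""
    return [
        n
        for n in range(1, max_primaries + 1)
        if low_gb <= index_size_gb / n <= high_gb
    ]


if __name__ == "__main__":
    # For the 30GB index: 1 or 2 primaries fit, 3 sits on the lower bound, 4 falls below it.
    print(candidate_primary_counts(30.0))
```

This narrows the options but does not pick a winner among them; that still requires benchmarking.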

Almost certainly not this. Or at least, some workloads will get better results with 1x30GiB primary shard and others might do better with 2x15GiB primary shards. There's no way to be sure without benchmarking both setups using your specific workload and data.

Yes, definitely. My options did not represent the point properly.
Anyway, since all of you said it depends on various factors, I will check whether I can set up and run Rally against our running service.

Thank you all!