How to allocate ES shards across 20 ES data nodes effectively?

Dear folks,

Currently we have 26 nodes in total in an ES cluster: 3 masters, 2 clients, and 20 data nodes.
Each ES data node server has 16 cores, 32 GB of memory, 4.6 TB of HDD storage in RAID 5, and a 1000 Mbps NIC.
We want to spread all indexes across all data nodes in order to make better use of the servers' computing power and to keep data for a long time.

My question is: what is the best sharding scheme across all 20 ES data nodes? (It should also tolerate 1 or 2 data nodes being down because of network, hardware, or other failures.)

20 primary shards, 1 replica
10 primary shards, 2 replicas
19 primary shards, 1 replica

Any good suggestions are appreciated!!!

Regards

Robin

10 primaries with 1 replica each would give 20 shards in total, i.e. 1 shard per data node, which is pretty evenly distributed.

Though if your shards are small it may be a bit of a waste.
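
To make the shard-per-node arithmetic concrete, here is a rough sketch that compares the options from the question plus the 10 primaries / 1 replica layout. The function name `shard_layout` is made up for illustration, and it assumes the cluster balances shards evenly across the 20 data nodes and that all replicas are allocated; it says nothing about shard size, which matters just as much.

```python
import math

DATA_NODES = 20  # number of data nodes given in the question


def shard_layout(primaries, replicas, nodes=DATA_NODES):
    """Count shard copies per index and how they spread over the data nodes.

    Assumes perfectly even balancing, which is an idealisation.
    """
    total = primaries * (1 + replicas)  # primaries plus all replica copies
    return {
        "total_shards": total,
        "min_per_node": total // nodes,
        "max_per_node": math.ceil(total / nodes),
        # Each shard has (1 + replicas) copies on distinct nodes, so losing
        # up to `replicas` nodes at once still leaves one copy of every shard.
        "node_losses_tolerated": replicas,
    }


# (primaries, replicas) pairs discussed in this thread
for primaries, replicas in [(20, 1), (10, 2), (19, 1), (10, 1)]:
    print(primaries, replicas, shard_layout(primaries, replicas))
```

Under these assumptions only the 2-replica option is guaranteed to survive any 2 data nodes going down at the same time; with 1 replica, losing 2 nodes can take out both copies of a shard.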

Dear @warkolm

It seems that with 10 primaries and 1 replica we can use only half of the total disk space of the ES data nodes for primary data.
If we have 100 TB in total across the 20 ES nodes, we can use only 50 TB for primary shards, and the rest of the disk space goes to replicas. The problem is that our indexes will probably exceed 50 TB a year.

What about 20 primaries and 1 replica per index?
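
For reference, the disk arithmetic can be sketched as below. The helper `usable_primary_tb` is a hypothetical name, the 100 TB figure is the round number used above (20 x 4.6 TB is closer to 92 TB), and the sketch ignores filesystem overhead and the free-space headroom Elasticsearch needs, so real numbers will be lower.

```python
import math


def usable_primary_tb(raw_tb, replicas):
    """Space left for primary data when every shard has `replicas` extra copies.

    Each byte of primary data is stored (1 + replicas) times, so the share
    available to primaries depends on the replica count only.
    """
    return raw_tb / (1 + replicas)


RAW_TB = 100  # round figure from this thread, not a measured value

one_replica = usable_primary_tb(RAW_TB, replicas=1)   # 10 or 20 primaries alike
two_replicas = usable_primary_tb(RAW_TB, replicas=2)

print(f"1 replica:  {one_replica:.1f} TB for primary data")
print(f"2 replicas: {two_replicas:.1f} TB for primary data")
```

The primary count does not appear in the formula: going from 10 to 20 primaries with 1 replica changes how the data is split, not how much of the disk is taken by replicas.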

Thanks

Any suggestions?

Regards

The sharding scheme depends on the use case, so more details about your type of data and expected volumes would help. Will you be using time-based indices? If so, have you read this blog post about sharding practices?

