@Christian_Dahlqvist, A good rule-of-thumb is to keep the number of shards per node below 20 per GB heap. A node with a 30GB heap should, therefore, have a maximum of 600 shards. This will generally help the cluster stay in good health.
What this means is that if I spin up a 16C/64GB machine as one node and give it a 30 GB heap, I can put a maximum of 600 shards on it.
In our case, we just need 100 shards.
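To sanity-check my reading of the rule of thumb, here is a minimal sketch of the arithmetic. The function name `max_shards_for_heap` is just something I made up for illustration, and the factor of 20 is taken from the quote above.

```python
# Rule of thumb from the quote above: at most 20 shards per GB of heap.
SHARDS_PER_GB_HEAP = 20


def max_shards_for_heap(heap_gb: int) -> int:
    """Return the rule-of-thumb shard ceiling for a node with the given heap."""
    return SHARDS_PER_GB_HEAP * heap_gb


limit = max_shards_for_heap(30)  # 30 GB heap on a 16C/64GB machine
print(limit)        # 600
print(25 <= limit)  # True: 25 shards per data node is well under the ceiling
```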
Node 1: 16C/64GB: Master
Node 2: 16C/64GB, Data node 1, 25 shards * 20 GB = 500 GB
Node 3: 16C/64GB, Data node 2, 25 shards * 20 GB = 500 GB
Node 4: 16C/64GB, Data node 3, 25 shards * 20 GB = 500 GB
Node 5: 16C/64GB, Data node 4, 25 shards * 20 GB = 500 GB
So, technically 100 shards with a size of 20 GB each on 4 data nodes and 1 master node.
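For reference, this is how I am totalling the layout, as a rough sketch. The node names in the dictionary are placeholders, and I am simply counting the 100 shards as stated, without assuming anything about whether they include replicas.

```python
SHARD_SIZE_GB = 20  # target size per shard, as planned above

# Placeholder names for the four data nodes; each holds 25 shards.
shards_per_data_node = {"data-1": 25, "data-2": 25, "data-3": 25, "data-4": 25}

# Disk used by shard data on each node, and totals across the cluster.
per_node_gb = {
    name: shards * SHARD_SIZE_GB for name, shards in shards_per_data_node.items()
}
total_shards = sum(shards_per_data_node.values())
total_gb = sum(per_node_gb.values())

print(per_node_gb["data-1"])   # 500
print(total_shards, total_gb)  # 100 2000
```

So the plan works out to roughly 2 TB of shard data across the four data nodes.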
- Is this design feasible? If yes, how many indices would be optimal for these 100 shards with this workload?