I've seen some confusing (seemingly contradictory) official guidelines about sizing an Elasticsearch cluster, so I'd like to know if somebody can shed some light on this.
These are the guidelines in question:
As per the ES sizing and capacity planning webinar (minute 0:49:16), we should keep the memory-to-disk ratio of our hot data nodes at 1:30. So one hot node with 64GiB of RAM should have a disk no larger than 1920GiB.
As per the blog entry "How many shards should I have in my ES cluster":
- ... it is common to see shards between 20GB and 40GB in size; and:
- ... a good rule-of-thumb is to ensure you keep the number of shards per node below 20 per GB heap it has configured. A node with a 30GB heap should therefore have a maximum of 600 shards
And here comes what I think is a contradiction. Suppose a machine with 64GiB of RAM (30GiB of Java heap). According to the first recommendation, the disk should be at most 1920GiB. If I make each shard 30GiB as per the second recommendation, I could only fit 64 of them on a single node (1920 / 30), but the third recommendation says ES supports 600 shards on that same node with 64GiB of RAM. So there is a mismatch of roughly 9x in how many shards I can put on that node.
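To make the mismatch concrete, here is the back-of-the-envelope arithmetic as a quick script. Nothing here queries Elasticsearch; the constants are just the rule-of-thumb numbers quoted above:

```python
# One hypothetical hot node, using the quoted rules of thumb.
ram_gib = 64    # physical RAM
heap_gib = 30   # Java heap (the usual ~50% of RAM, kept under 32GiB)

# Rule 1: memory-to-disk ratio of 1:30 for hot nodes.
max_disk_gib = ram_gib * 30                       # 1920 GiB of disk

# Rule 2: ideal shard size 20-40 GiB; take 30 GiB.
shard_size_gib = 30
shards_by_disk = max_disk_gib // shard_size_gib   # shards that fit on disk

# Rule 3: at most 20 shards per GiB of heap.
shards_by_heap = heap_gib * 20                    # heap-based shard limit

print(f"disk limit: {max_disk_gib} GiB")
print(f"shards by disk: {shards_by_disk}")        # 64
print(f"shards by heap: {shards_by_heap}")        # 600
print(f"mismatch: ~{shards_by_heap / shards_by_disk:.1f}x")
```

The disk-based limit (64 shards) and the heap-based limit (600 shards) differ by about 9.4x, which is the mismatch described above.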
I understand these are guidelines, not exact math, but they suggest numbers so far apart that it's not possible to follow all of them at once.
So, my questions are:
- Under which situation could I put 600 shards on a node? If the ideal shard size is around 30GiB, that would require about 18TiB of disk, which goes way above the 1:30 ratio.
- In the 1:30 ratio, is the ratio really "physical RAM" vs. "physical disk"? Or should it be "physical RAM" vs. "disk used by shards"? Note that none of these disk figures accounts for keeping 20% of the disk free (as recommended in other guidelines).
- Shouldn't the 1:30 ratio consider the size of each shard? Putting 30 shards of 20GiB on the disk is not the same as putting 15 shards of 40GiB: the total disk usage is identical (600GiB), but the shard count differs, and it's the shard count that drives heap usage.
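To put numbers on that last point, here is a small sketch comparing the two layouts. The per-shard heap "cost" is just the inverse of the quoted 20-shards-per-GiB-of-heap rule of thumb, used for illustration only:

```python
# Two ways to fill the same 600 GiB of shard data on one node.
options = [
    {"count": 30, "size_gib": 20},  # many smaller shards
    {"count": 15, "size_gib": 40},  # fewer larger shards
]

for opt in options:
    total_gib = opt["count"] * opt["size_gib"]
    # Rule 3 inverted: 20 shards per GiB of heap means each shard
    # consumes roughly 1/20 GiB of heap headroom under that rule.
    heap_cost_gib = opt["count"] / 20
    print(f'{opt["count"]} x {opt["size_gib"]} GiB = {total_gib} GiB on disk, '
          f'~{heap_cost_gib} GiB of heap headroom')
```

Both layouts occupy the same 600GiB of disk, but the 30-shard layout consumes twice the heap headroom of the 15-shard one, which is why a ratio based on disk size alone seems incomplete.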