Google Cloud, AWS, and others now offer instances with 2TB or more of memory. The conventional wisdom has been not to give Elasticsearch a heap larger than ~32GB, so the JVM can keep using compressed (32-bit) object pointers, but I wonder whether that advice still holds on these larger-memory machines.
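For reference, here is a rough sketch of how you can ask a JVM directly where that cutoff falls (this assumes `java` is on your PATH, and the heap sizes probed are just examples):

```python
#!/usr/bin/env python3
"""Probe where this JVM stops using compressed oops as -Xmx grows."""
import subprocess

def compressed_oops_enabled(heap: str) -> bool:
    # -XX:+PrintFlagsFinal dumps the resolved flag values; -version exits immediately.
    out = subprocess.run(
        ["java", f"-Xmx{heap}", "-XX:+PrintFlagsFinal", "-version"],
        capture_output=True, text=True, check=True,
    ).stdout
    return any("UseCompressedOops" in line and "true" in line
               for line in out.splitlines())

if __name__ == "__main__":
    # Example heap sizes around the usual ~32GB threshold; adjust as needed.
    for heap in ("30g", "31g", "32g", "33g"):
        print(f"-Xmx{heap}: compressed oops = {compressed_oops_enabled(heap)}")
```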
Has the Elasticsearch team (or anyone) experimented with running a cluster on very large machines? The question comes down to this: which is the better setup?
- 1 node with 2TB of RAM (half to heap, half to disk cache)
- 32 nodes with 64GB of RAM (half to heap, half to disk cache)
(Assume the same total number of CPUs.) At some point the network overhead of so many nodes will outweigh the overhead of uncompressed 64-bit pointers. In addition, disk caching is likely more effective when all the data is local to a single large machine.
I am curious whether anyone has tested this or similar large-node setups.
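If anyone does run such a test, this is roughly how I would confirm what each node's JVM is actually doing, via the nodes info API (a sketch assuming a recent Elasticsearch version that reports `using_compressed_ordinary_object_pointers`; the localhost URL is a placeholder for your cluster):

```python
#!/usr/bin/env python3
"""Report max heap and compressed-oops status for every node in a cluster."""
import json
from urllib.request import urlopen

NODES_URL = "http://localhost:9200/_nodes/jvm"  # placeholder for your cluster

with urlopen(NODES_URL) as resp:
    info = json.load(resp)

for node_id, node in info["nodes"].items():
    jvm = node["jvm"]
    heap_gb = jvm["mem"]["heap_max_in_bytes"] / 2**30
    # Field reported by recent versions; older releases may omit it.
    coops = jvm.get("using_compressed_ordinary_object_pointers", "unknown")
    print(f"{node['name']}: heap_max = {heap_gb:.1f} GiB, compressed oops = {coops}")
```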