Experience with Large Memory Nodes (1TB, 2TB, and more)

Google Cloud, AWS, and others now offer nodes with 2TB or more of memory. The conventional wisdom has been to give the Elasticsearch heap no more than ~32GB so the JVM can use compressed object pointers (compressed oops), but I wonder whether that wisdom still holds on these larger memory machines.
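For context on that threshold, here is a minimal, HotSpot-only sketch (the class name CheckCompressedOops is just an illustrative choice) that asks the running JVM whether compressed oops are actually in effect for a given heap size:

    import java.lang.management.ManagementFactory;
    import com.sun.management.HotSpotDiagnosticMXBean;

    public class CheckCompressedOops {
        public static void main(String[] args) {
            // HotSpot-specific: ask the running JVM whether compressed
            // ordinary object pointers (compressed oops) are in effect.
            HotSpotDiagnosticMXBean hotspot =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            String compressedOops = hotspot.getVMOption("UseCompressedOops").getValue();

            // Report the configured max heap alongside the oops setting.
            double maxHeapGb = Runtime.getRuntime().maxMemory() / (1024.0 * 1024 * 1024);
            System.out.printf("Max heap: %.1f GB%n", maxHeapGb);
            System.out.println("UseCompressedOops: " + compressedOops);
        }
    }

Running it with java -Xmx31g CheckCompressedOops and again with java -Xmx40g CheckCompressedOops should show UseCompressedOops flip from true to false somewhere just below 32GB, which is where the "~32GB" guidance comes from.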

Has the Elasticsearch team (or anyone else) experimented with running a cluster on very large machines? The question comes down to this: which is the better setup?

  • 1 node with 2TB of RAM (half to heap, half to disk cache)
  • 32 nodes with 64GB of RAM each (half to heap, half to disk cache)

(Assume the same total number of CPUs.) At some point the network overhead of so many nodes will outweigh the overhead of uncompressed 64-bit pointers. In addition, disk caching is likely more effective when all the data is localized on a single large machine.

I am curious whether anyone has tested this or similar large-node setups.

Google Cloud Pricing

I don't think so, or at least I haven't heard of any. Beyond the question of pointer sizes, I expect the effects of NUMA would become more pronounced on such enormous nodes.

I'd also recommend considering setups with a total heap size much smaller than 1TiB, leaving more memory for the disk cache.
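As an illustrative sketch of that idea (not an official recommendation, and the 31g value is an assumption rather than a tuned number), the per-node heap could be pinned well under the compressed-oops threshold in config/jvm.options:

    # config/jvm.options (illustrative sketch; values are assumptions)
    # Keep min and max heap equal, and well under ~32GB so compressed oops stay enabled.
    # Everything not given to the heap is left to the OS filesystem cache.
    -Xms31g
    -Xmx31g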


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.