In general we recommend keeping the CPU:RAM ratio around 1:8 (that's the typical ratio you'll see on standard AWS hardware). We have observed issues with smaller clusters failing to load at 1:11 or worse; the temporary fix is described below, and a full fix is coming in 2.2 or 2.3.
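To make the ratio concrete, here's a quick back-of-the-envelope check (the host shapes are just illustrative, not a recommendation):

```python
# Illustrative CPU:RAM ratio check (the host shapes are made up for the example).
def cpu_ram_ratio(vcpus: int, ram_gb: int) -> str:
    """Express the CPU:RAM ratio as 1:N."""
    return f"1:{ram_gb / vcpus:.1f}"

print(cpu_ram_ratio(8, 64))    # 1:8.0  -> within the recommended range
print(cpu_ram_ratio(4, 64))    # 1:16.0 -> worse than 1:11, expect load issues
print(cpu_ram_ratio(16, 128))  # 1:8.0  -> fine
```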
There are a few workarounds:
As you point out, you can artificially constrain the allocator capacity. This makes the most sense if you have no control over how the capacity will be used, since in that case you could end up with, say, 6x 64GB instances all maxing out their CPU.
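If you want to do that via the API rather than at install time, something along these lines should work. Note that the endpoint path, the port, and the `capacity` field (in MB) are assumptions on my part, so check them against the API reference for your ECE version:

```python
# Hedged sketch: cap an allocator's advertised capacity via the ECE API.
# The URL, port (12443) and the "capacity" field (in MB) are assumptions --
# verify them against the API docs for your ECE version before using.
import requests

ECE_HOST = "https://my-coordinator:12443"   # hypothetical coordinator address
ALLOCATOR_ID = "allocator-192.168.1.10"     # hypothetical allocator id
AUTH = ("admin", "changeme")                # use your real admin credentials

resp = requests.put(
    f"{ECE_HOST}/api/v1/platform/infrastructure/allocators/{ALLOCATOR_ID}/settings",
    json={"capacity": 32768},               # advertise only 32 GB of the host's RAM
    auth=AUTH,
    verify=False,                           # only if you're still on self-signed certs
)
resp.raise_for_status()
print(resp.json())
```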
You can disable the hard_limit (set it to false) in the advanced cluster data section. This allows clusters to use all of the CPU, and CPU is still shared out in the right ratio when the box is overprovisioned.
There is also an undocumented API call we've been giving out via support to make that the default. In my opinion this is the most desirable configuration for many ECE deployments, i.e. you get extra performance in return for it being less predictable.
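For reference, the fragment you merge into the advanced cluster data section looks roughly like the sketch below; the exact key path is an assumption here, so confirm it for your ECE version before relying on it:

```python
# Hedged sketch: the fragment to merge into the cluster's data/metadata in the
# advanced editor to relax the CPU hard limit for one deployment.
# The key path ("resources" -> "cpu" -> "hard_limit") is an assumption --
# double-check it for your ECE version before applying.
import json

cpu_override = {
    "resources": {
        "cpu": {
            "hard_limit": False   # let this cluster's instances use spare host CPU
        }
    }
}

# Paste the resulting JSON into the advanced cluster data section
# (merged with whatever is already there).
print(json.dumps(cpu_override, indent=2))
```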
You can keep the hard limit but raise the CPU overcommit factor (it defaults to 1.2). There's a post on this forum where I describe how to set that; it should be easy to find with a search.
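To illustrate what the overcommit factor changes, here is a simplified model of how a per-instance CPU quota loosens as the factor goes up; the formula is my simplification for illustration, not the exact ECE implementation:

```python
# Simplified illustration (not the exact ECE implementation) of how a CPU
# overcommit factor loosens the per-instance quota under a hard limit.
def cpu_quota(host_vcpus: int, host_ram_gb: int, instance_ram_gb: int,
              overcommit: float) -> float:
    """Approximate vCPUs an instance may use: its RAM share of the host,
    scaled by the overcommit factor."""
    return host_vcpus * (instance_ram_gb / host_ram_gb) * overcommit

# A 16 GB node on an 8 vCPU / 64 GB host:
print(cpu_quota(8, 64, 16, overcommit=1.2))  # ~2.4 vCPUs with the default factor
print(cpu_quota(8, 64, 16, overcommit=2.0))  # ~4.0 vCPUs if you raise the factor
```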
But if you're not going to be using the capacity, then restricting the allocator capacity makes the most sense (especially given licensing costs), and if you are going to use the capacity, then you need a plan for managing over-provisioning anyway.