Elasticsearch recommends not setting the JVM heap above 32GiB, because of a pointer optimization called "compressed oops" that only applies when the heap size is under 32GiB (1).
However, I have found little discussion of exactly how much of a performance impact exceeding that threshold has on CPU and memory. Does anyone have insight into this? I couldn't find a detailed analysis anywhere.
This is my understanding; others could probably give you a deeper analysis.
In short, once you pass the compressed-oops threshold, every object pointer doubles from 4 to 8 bytes, so the same number of objects takes considerably more room in the heap. You use much more heap with no benefit. To get any actual benefit, you then have to increase the heap to a much larger size, and garbage collection of a heap that large can take a very long time, reducing performance.
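If you want to see the cutoff in practice, one quick way is to ask the JVM whether it enabled compressed oops for a given heap size (a sketch; the exact output format and annotations vary by JDK version):

```
# Just below the threshold: compressed oops stay enabled
$ java -Xmx31g -XX:+PrintFlagsFinal -version | grep UseCompressedOops
     bool UseCompressedOops = true

# Above it: the JVM silently falls back to full 64-bit pointers
$ java -Xmx33g -XX:+PrintFlagsFinal -version | grep UseCompressedOops
     bool UseCompressedOops = false
```

Elasticsearch also reports this at startup, in a log line of the form "heap size [...], compressed ordinary object pointers [true]", so you can confirm it from the node's own logs.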
Is there a use case to go bigger? Perhaps, for very specific reasons, and only for THE most experienced Elasticsearch users.
In general, this is not a case where bigger is better.
The extra RAM can still help, not in the JVM, but with the filesystem cache etc.
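For what it's worth, you can watch that split on a running cluster via the nodes stats API; a minimal example (the `filter_path` shown is just one way to trim the response):

```
# Heap usage vs. OS-level memory (which includes the filesystem cache)
curl -s 'localhost:9200/_nodes/stats/jvm,os?filter_path=nodes.*.jvm.mem.heap_used_percent,nodes.*.os.mem'
```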
My situation is that I have a choice between a few tall nodes with a large amount of memory (12 nodes with 256GB each) or more, smaller nodes (24-36 nodes with 64GB each). I'm weighing the two, as I'm not sure whether having too many nodes will hurt performance.
From what I'm hearing, I think I should use smaller nodes.
Elasticsearch is built to be distributed from the ground up. More nodes, 24-36, is not a problem, and the cluster will probably be more resilient.
Elastic Cloud runs tens of thousands of clusters for our customers, from a single node to 100+ nodes. Not a single one runs on a host larger than 64GB, with a properly sized JVM.
As with all questions of performance, the only way to get an accurate answer is to run your own benchmarks using real data and real workloads.
However, I would recommend considering 12 nodes with 31GiB (or less) of heap, leaving the remaining 225GiB (or more) of memory on each node for the filesystem cache. For many workloads ES doesn't need that much JVM heap, but the filesystem cache is crucial for performance.
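For example, a minimal heap override for that layout might look like this (assuming ES 7.7+ where the `jvm.options.d` directory is supported; the filename is arbitrary):

```
# config/jvm.options.d/heap.options
# Keep min and max equal, and stay below the compressed-oops threshold.
-Xms31g
-Xmx31g
```

Setting `-Xms` and `-Xmx` to the same value avoids heap resizing pauses, which is the configuration Elasticsearch's own docs recommend.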