Consider the following situation: the node has 100 GB of RAM, but the ES heap size is 30 GB.
Does this mean that ES will only ever use 30 GB of RAM? And the other 70 GB will go unused?
If not, then how does it use the memory?
P.S. I see lots of articles on setting the correct size of the heap, but can't find anything that describes what it is and how it works in relation to ES.
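(For context on what "the ES heap size is 30 GB" means in practice: the heap is the memory the JVM manages for Elasticsearch itself, and it is normally pinned with the standard JVM heap flags, e.g. in config/jvm.options or, on newer versions, a file under config/jvm.options.d or the ES_JAVA_OPTS environment variable. A minimal sketch of such a setting, with the exact file location depending on the version and install method:)

    # config/jvm.options (location varies by installation)
    # Min and max heap set to the same value so the JVM claims it up front
    -Xms30g
    -Xmx30g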
I think that's not quite the case. It's normal to see the Elasticsearch process using more memory than the configured heap size (usually not more than 2×). It also indirectly uses the rest of the available memory in the system via the filesystem cache. A node with 100GB of memory may well perform better than a node with 64GB of memory even though Elasticsearch's heap size is 30GB in both cases. The filesystem cache is very important. The extra memory is not wasted.
I'm not sure I understand the distinction. The OS provides the RAM that the JVM uses too, both for the heap and for everything else. Your original question was whether this extra memory was used or not, and the answer is that it is indeed used.
Elasticsearch (really Lucene) puts a lot of effort into accessing files in a way that increases the chances that the data it needs is already cached, and a larger filesystem cache can make that much easier.
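To illustrate the mechanism, here is a rough Java sketch (not Lucene's actual code; the file name example.segment is made up, and it assumes the file fits in a single mapping of under 2 GB): memory-mapping a file makes its bytes readable through the OS page cache, so repeated reads of already-cached pages never touch the disk and never consume JVM heap.

    import java.io.IOException;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    public class MmapSketch {
        public static void main(String[] args) throws IOException {
            Path path = Path.of("example.segment"); // hypothetical file name
            try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
                // The mapped bytes are backed by the OS page cache, not the JVM heap.
                MappedByteBuffer buf = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
                long sum = 0;
                while (buf.hasRemaining()) {
                    sum += buf.get(); // touching a page faults it into the cache if it isn't resident yet
                }
                System.out.println("byte sum = " + sum);
            }
        }
    }

The more spare RAM the OS has, the more of those pages stay resident, which is why the extra memory is not wasted.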
David, what I mean is this. Consider the following (pseudo)code, written as Java for concreteness:
    var content = java.nio.file.Files.readString(java.nio.file.Path.of("foo.txt")); // read the whole file into a heap-allocated String
All the application (ES) did was read the contents of the file into a variable, and RAM was allocated to the application for that variable. Behind the scenes, the OS may also have placed foo.txt into the filesystem cache so that the next time the file is read, the disk doesn't have to be touched. That RAM isn't used directly by the application; it's used by the OS on the application's behalf.
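And you can see that effect directly with something like this small Java sketch (foo.txt is just a placeholder path; the exact timings obviously depend on the machine and OS). The second read is usually much faster because the OS serves it from the page cache, while the JVM heap only ever holds the byte[] copies:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;

    public class PageCacheDemo {
        public static void main(String[] args) throws IOException {
            Path path = Path.of("foo.txt"); // placeholder path from the example above

            long t0 = System.nanoTime();
            byte[] first = Files.readAllBytes(path);  // likely hits the disk (cold cache)
            long t1 = System.nanoTime();
            byte[] second = Files.readAllBytes(path); // likely served from the OS page cache
            long t2 = System.nanoTime();

            System.out.printf("first read:  %d bytes in %.2f ms%n", first.length, (t1 - t0) / 1e6);
            System.out.printf("second read: %d bytes in %.2f ms%n", second.length, (t2 - t1) / 1e6);
            // Only the byte[] copies live on the application's heap; the cached
            // pages that make the second read fast belong to the OS.
        }
    }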