However, (1) is an odd recommendation: if ES is the primary application on the system, surely it would benefit from a larger heap so it can keep data in its own cache, as opposed to relying on the kernel to cache disk access? Also, why 50%? Is it just an old wives' tale, or is there some science behind it?
Elasticsearch mostly doesn't have its "own cache" for on-disk data. Modern operating systems do a very good job of this already, so it makes a good deal of sense to rely on that.
Perhaps a poor choice of words: the phrase "old wives' tale" kinda excludes from this discussion those of us who are old wives.
In this case there is indeed science behind this guideline. Elasticsearch heavily uses off-heap data, and AIUI the limit on off-heap data is by default equal to the heap size, so to allow all of the memory it needs to be allocated in RAM you should set the heap size to no more than half of the total RAM.
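To make that arithmetic concrete, here's a small sketch of why the guideline halves the RAM, assuming the JVM default where the direct (off-heap) memory limit equals the heap size; the 32 GB machine is purely hypothetical:

```python
# Sketch of the sizing logic, assuming the JVM default where
# MaxDirectMemorySize (off-heap limit) equals the heap size (-Xmx).
ram_gb = 32                      # hypothetical machine
heap_gb = ram_gb // 2            # the "no more than 50%" guideline
direct_gb = heap_gb              # default off-heap limit == Xmx
worst_case_jvm_gb = heap_gb + direct_gb

# In the worst case the JVM can use all of RAM, but no more than
# that, so its allocations never spill into swap.
print(worst_case_jvm_gb <= ram_gb)  # True
```

Set the heap any higher and the worst-case JVM footprint (heap plus off-heap) exceeds physical RAM.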
OK, so let me ask the reverse question: if heap usage isn't that important to ES beyond a certain point (say, the ability to hold a search result set in memory before transmission), why not allocate it less memory, say 25%? Wouldn't that improve performance, since more memory would then be available for the kernel disk cache and off-heap data?
Note that the documentation gives 50% as a recommended upper limit, not a target:
Set Xmx to no more than 50% of your physical RAM
You can of course set it lower. I think that reducing the heap size also reduces the space available for off-heap data (they share a limit), but you are right that it increases the space available for the filesystem cache. Would that improve performance? It depends™. A smaller heap means a full GC would be faster, but perhaps more frequent, and it means Elasticsearch can cache less data internally(*). With some workloads this could be a net win. Only proper benchmarking can tell.
(*) Elasticsearch relies on the filesystem cache for fast access to on-disk data but it does have its own caches for other data that isn't kept on-disk.
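For reference, here's what capping the heap looks like in practice. The file path follows Elasticsearch's `config/jvm.options.d/` convention, and the values are illustrative for a hypothetical 32 GB host, not a recommendation:

```
# config/jvm.options.d/heap.options (hypothetical 32 GB host)
# Xms and Xmx should match; 16g honours the 50% upper limit.
-Xms16g
-Xmx16g
```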