Memory size settings

Couple of quick questions:

The default setting is for the memory to be allocated outside of the JVM
heap. Is there a limit, or will Elasticsearch simply take as much as
possible?

The node filter cache size is tunable. Is all the memory allocated up
front, or is the setting an upper limit? With large heap sizes, the
default of 20% would be too high if it is dedicated to the filter cache
alone.

Field cache: part of the heap or part of the memory allocated outside of
the JVM?

If the field cache and memory are outside of the JVM, what would be the use
of a large heap size on high-memory nodes?

mmap/mlockall: yay or nay? So far testing has shown little difference with
mlockall. Had a previous bad experience with mmap on a Lucene instance
(long commits).

Cheers,

Ivan

--

Hi Ivan,

On Wednesday, November 21, 2012 9:15:31 PM UTC+1, Ivan Brusic wrote:

Couple of quick questions:

The default setting is for the memory to be allocated outside of the JVM
heap. Is there a limit, or will Elasticsearch simply take as much as
possible?

If you refer to direct memory, the default is unlimited (Long.MAX_VALUE)
unless you set ES_DIRECT_SIZE in bin/elasticsearch.in.sh. The JVM will try
to consume as much memory as the OS allows - it depends on the platform.
See also
http://docs.oracle.com/javase/6/docs/api/java/lang/Runtime.html#maxMemory()
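
For example, a cap could be set in bin/elasticsearch.in.sh like this (the
4g value is just an illustration):

    # Cap off-heap (direct) memory instead of the unlimited default;
    # the startup script hands this to the JVM as -XX:MaxDirectMemorySize.
    ES_DIRECT_SIZE=4g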

The node filter cache size is tunable. Is all the memory allocated up
front, or is the setting an upper limit? With large heap sizes, the
default of 20% would be too high if it is dedicated to the filter cache
alone.

The node filter cache fills as filter requests come in and demand caching.
20% is an upper limit; if it is exceeded, entries are evicted from the
cache.
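
Nothing is reserved up front, so on a big heap you can simply lower the
limit. A minimal sketch, assuming the node-level indices.cache.filter.size
setting and the es.-prefixed system property form of this era (the 10%
value is just an illustration):

    # Lower the node filter cache cap from the default 20%; entries are
    # cached on demand and evicted once the limit is exceeded.
    bin/elasticsearch -Des.indices.cache.filter.size=10%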

Field cache: part of the heap or part of the memory allocated outside of
the JVM?

Part of the heap.

If the field cache and memory are outside of the JVM, what would be the
use of a large heap size on high-memory nodes?

Large heaps hold Java objects (of all kinds) in RAM for a long time,
without frequent GC passes or detours through slow memory access (external
storage, disks).

mmap/mlockall: yay or nay? So far testing has shown little difference with
mlockall. Had a previous bad experience with mmap on a Lucene instance
(long commits).

On 64bit machines with a large virtual address space, yay - mmap is the
default. mmap has some advantages, because Java (Lucene) can perform
file-based reading a lot faster, since the OS can map the file buffer
directly into the virtual address space. For writing (commits), mmap does
not help. If things go wrong with mmap, there is less OS cache available
for writing files.

mlockall is an advanced option. If ES and all other processes take so many
resources that your machine starts to swap (i.e. paging memory to disk),
mlockall can force the ES process to stay in RAM and keep running fast
while other processes may not. mlockall has a negligible effect if you have
lots of RAM, the OS does not swap, or no other process takes many
resources.
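
If you want to try it, it must be enabled and the process needs permission
to lock memory; roughly (assuming the 0.20-era bootstrap.mlockall setting):

    # Allow the Elasticsearch user to lock memory, then start with
    # mlockall enabled so the heap cannot be paged out.
    ulimit -l unlimited
    bin/elasticsearch -Des.bootstrap.mlockall=true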

There was a glitch in the ES code that prevented mmapfs in some 0.19.x
versions, but it should work now.

Cheers,

Jörg

--

Hi Jörg,

Responses inline.

On Wed, Nov 21, 2012 at 5:11 PM, Jörg Prante joergprante@gmail.com wrote:

If you refer to direct memory, the default is unlimited (Long.MAX_VALUE)
unless you set ES_DIRECT_SIZE in bin/elasticsearch.in.sh. The JVM will try
to consume as much memory as the OS allows - it depends on the platform.
See also
http://docs.oracle.com/javase/6/docs/api/java/lang/Runtime.html#maxMemory()

Did not know about the ES_DIRECT_SIZE setting. It should be documented
(perhaps I will contribute once I fully comprehend all the bells and
whistles). Can direct memory cause out-of-memory exceptions if it is
outside of the heap, or will the OS not allow it?

The node filter cache fills as filter requests come in and demand caching.
20% is an upper limit; if it is exceeded, entries are evicted from the
cache.

I looked at the source code and saw the caching is done via Guava's caching
system. I am familiar with it, so now I understand exactly what is going on
behind the scenes.

Field cache: part of the heap or part of the memory allocated outside of
the JVM?

Part of the heap.

That was always my assumption, just wanted to know concretely. Once again,
looking at the source code provided me with the answer (in addition to your
response, of course).

If the field cache and memory are outside of the JVM, what would be the
use of a large heap size on high-memory nodes?

Large heaps hold Java objects (of all kinds) in RAM for a long time,
without frequent GC passes or detours through slow memory access (external
storage, disks).

I am quite aware of how heaps work. :) The question was more about which
objects are stored on the heap. The total cache sizes (field + filter) are
only several gigabytes each. How much more overhead does Elasticsearch
need? Making up some numbers (not in front of my app right now): if the
field cache is around 8GB and the filter cache around 4GB, would having a
32GB heap on a 64GB node make sense? Will Elasticsearch effectively use
the other 20GB (32-12) of heap? It might be more effective to have a
smaller heap and give the rest to direct memory.

On 64bit machines with a large virtual address space, yay - mmap is the
default. mmap has some advantages, because Java (Lucene) can perform
file-based reading a lot faster, since the OS can map the file buffer
directly into the virtual address space. For writing (commits), mmap does
not help. If things go wrong with mmap, there is less OS cache available
for writing files.

Isn't the default NIO on 64bit Linux?

mlockall is an advanced option. If ES and all other processes take so many
resources that your machine starts to swap (i.e. paging memory to disk),
mlockall can force the ES process to stay in RAM and keep running fast
while other processes may not. mlockall has a negligible effect if you have
lots of RAM, the OS does not swap, or no other process takes many
resources.

There was a glitch in the ES code that prevented mmapfs in some 0.19.x
versions, but it should work now.

That pretty much confirms my findings. Saw no performance gain, but my
main motive is to create a more fault-tolerant system with a reduced
chance of crashing due to CPU/memory pressure. Elasticsearch is the only
process besides monitoring and minor scripts.

Currently the app is more CPU-constrained than memory-constrained, which
leads me to believe I am not utilizing memory as efficiently as possible.
The field cache hits its upper limit rather quickly despite there being no
explicit field cache limit.

Cheers,

Ivan

--

Hi Ivan,

On Thursday, November 22, 2012 8:13:34 PM UTC+1, Ivan Brusic wrote:

Did not know about the ES_DIRECT_SIZE setting. It should be documented
(perhaps I will contribute once I fully comprehend all the bells and
whistles). Can direct memory cause out-of-memory exceptions if it is
outside of the heap, or will the OS not allow it?

It's a recently introduced setting and not really required; it maps to
-XX:MaxDirectMemorySize for the JVM. I guess the ES JVM can be made to
settle for less NIO memory by manipulating this parameter.

The memory outside the JVM heap is also garbage collected, so an OOM due
to a lack of NIO memory is possible. But with the default settings, the
JVM takes really good care of it.

The question was more about which objects are stored on the heap. The
total cache sizes (field + filter) are only several gigabytes each. How
much more overhead does Elasticsearch need? Making up some numbers (not in
front of my app right now): if the field cache is around 8GB and the
filter cache around 4GB, would having a 32GB heap on a 64GB node make
sense? Will Elasticsearch effectively use the other 20GB (32-12) of heap?
It might be more effective to have a smaller heap and give the rest to
direct memory.

I recommend tools like bigdesk for visualizing the heap allocation over
time. There is no easy answer, simply because there are a lot of possible
workloads. Just to name a few: if you are writing more data into the index
than you read, you can calculate the maximum batch size for the bulk API
against the heap and tune the slowest part of the system (typically the
disk I/O subsystem). If you have a high load of simple queries and small
result sets, you could be interested in moving all the Lucene index files
into RAM so response times are fast, probably by using mmap/mlockall. If
you have an analytical workload - long-running queries, each of them
generating huge result sets, with lots of filters and facets - you should
take care of large heaps first and keep the heap filled most of the time.
All these workloads can also be mixed; there is a "sweet spot" between
heap reservation and other memory, but it is hard to predict and, not
surprisingly, it may change over time while your application runs.
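
Whatever the split, the heap side is pinned with the standard environment
variable; a sketch for your 64GB example (sizes are just illustrations):

    # Give half the box to the heap and leave the rest to the OS file
    # cache and direct memory.
    ES_HEAP_SIZE=32g bin/elasticsearch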

ES can run with large heaps, even very large heaps of >8GB, although
current standard JVM versions will have a tough time handling them.
Generally, ES does not clutter the heap with lots of small objects, so GC
times will stay in an acceptable range. If you configure a 32GB heap, you
can run the warmer API or some excessive facet constructions to get the
heap filled at startup time. Not surprisingly, this will take some seconds
or even minutes. And be careful not to randomly destroy such large caches
with index writes, because they have to be rebuilt in such cases, which
gets more expensive the larger the cache is.
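
For example, a warmer registered on an index runs whenever new segments
become searchable, so the caches it touches get pre-filled (index, warmer,
and field names here are made up):

    # Register a warmer (new in 0.20); the terms facet forces the field
    # data for "tag" into the field cache.
    curl -XPUT 'localhost:9200/myindex/_warmer/tag_facet' -d '{
      "query" : { "match_all" : {} },
      "facets" : { "tags" : { "terms" : { "field" : "tag" } } }
    }'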

Isn't the default NIO on 64bit Linux?

Yes, in org.elasticsearch.index.store.IndexStoreModule there is a
heuristic: the default is niofs, except on Windows 64bit and Solaris 64bit
with mmap() support, where it is mmapfs; other Windows (32bit) gets
simplefs.

I would rather use mmapfs for Linux 64bit as well, but in the current
0.20 it is not the default.
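
Forcing it is a one-line override, e.g.:

    # Override the store heuristic; mmapfs maps the Lucene files
    # directly into the virtual address space.
    bin/elasticsearch -Des.index.store.type=mmapfs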

Currently the app is more CPU constrained than memory which leads me to
believe I am not utilizing memory as efficiently as possible. The field
cache hits its upper limit rather quickly despite there being no explicit
field cache limit.

Maybe it helps to play with the cache expiration policy, or the cache
entry type; see the field cache settings in the Elasticsearch
documentation.

"resident" is the default and tends to get full utilization and OOM early,
so "soft" and "weak" are an option to help the garbage collector when
memory gets tight. They correspond to the Java SoftReference and
WeakReference classes.
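
A minimal sketch of switching the entry type and expiration (assuming the
0.x index.cache.field.* settings; the values are just illustrations):

    # "soft" wraps cache entries in SoftReferences the GC may reclaim
    # under memory pressure; expire ages entries out after 10 minutes.
    bin/elasticsearch -Des.index.cache.field.type=soft \
                      -Des.index.cache.field.expire=10m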

Best regards,

Jörg

--