ES & Lucene 32GB heap: myth or fact?

Hi,

On Friday, March 27, 2015 at 5:02:05 PM UTC+1, Aaron Mefford wrote:

I think part of what you may be missing is the intent that Elasticsearch
be scaled out rather than up. There are other issues that occur when you
scale up instead of out, the first of which is that losing a single node of
your cluster can be disastrous. It is also generally far more expensive to
scale up than to scale out.

I don't mean that I doubt scaling out. This topic touches the "magic
line" of 32GB heap size, not replacing multiple nodes with a single node.
Please remember that you can still have 50 machines with 40GB heaps each,
for example. And please also don't forget that distribution has some
drawbacks of its own.

That said, I am interested in this, as it is increasingly common to have
128GB or 256GB of RAM in a typical enterprise machine that didn't break the
bank.

If I had access to such machines I would run some benchmarks to show the
differences. What does memory utilization look like after ingesting a
large number of docs, or with a given query mix?

One other note: these cautions are not unique to Elasticsearch; they are
made for Solr as well. I do know that the impact of GC on large heaps
is very real, and a very powerful cluster can fall apart if things are not
tuned well and nodes do run out of memory.
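
For reference, Elasticsearch 1.x set its CMS tuning in bin/elasticsearch.in.sh
with flags along these lines (quoting from memory, so verify the exact values
in your own version):

    -XX:+UseConcMarkSweepGC                 # concurrent collector, favors short pauses
    -XX:CMSInitiatingOccupancyFraction=75   # start a CMS cycle before the old gen fills up
    -XX:+UseCMSInitiatingOccupancyOnly      # use that threshold, not the JVM's own heuristic

The bigger the heap, the more those defaults matter, because a concurrent-mode
failure on a huge old generation means a very long stop-the-world pause.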

On Friday, March 27, 2015 at 2:36:00 AM UTC-6, Jörg Prante wrote:

The statement "It wastes memory, reduces CPU performance, and makes the
GC struggle with large heaps." reads like there is a catastrophe waiting,
and it is a bit overstated. It may waste memory usable by the JVM heap, true.
But it does not reduce CPU performance - uncompressed OOPs with LP64 exercise
memory and cache bandwidth, not the CPU. And "GC struggle" is due solely to
how GC works with heap objects - not related to OOPs. In fact, GC is a bit
slower with compressed OOPs because of the overhead of encoding/decoding
addresses.
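
To make that encode/decode overhead concrete: decoding a compressed pointer
is essentially one shift and one add per access. Here is a rough model,
assuming HotSpot's default 8-byte object alignment (an illustrative sketch
only, not actual JVM source):

    // Illustrative model of compressed OOPs. Assumes 8-byte object
    // alignment (the HotSpot default). Not real JVM code.
    public class CompressedOopModel {
        static final long HEAP_BASE = 0L;     // HotSpot can often use a zero base for small heaps
        static final int ALIGNMENT_SHIFT = 3; // log2 of the 8-byte alignment

        // Encode a 64-bit address into a 32-bit "narrow oop".
        static int encode(long address) {
            return (int) ((address - HEAP_BASE) >>> ALIGNMENT_SHIFT);
        }

        // Decode: this shift+add is the per-access cost mentioned above.
        static long decode(int narrowOop) {
            return HEAP_BASE + ((narrowOop & 0xFFFFFFFFL) << ALIGNMENT_SHIFT);
        }

        public static void main(String[] args) {
            long address = 0x7_4000_0000L; // an 8-byte-aligned address, ~29GB into the heap
            int narrow = encode(address);
            System.out.printf("0x%x -> 0x%x -> 0x%x%n", address, narrow, decode(narrow));
        }
    }

With 32 bits of narrow oop and a shift of 3, 2^32 * 8 bytes = 32GB is exactly
the addressable window, which is where the "magic line" number comes from.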

Jörg

On Fri, Mar 27, 2015 at 8:53 AM, Paweł Róg pro...@gmail.com wrote:

Hi,
Exactly. ES is optimized to use large objects (arrays of primitives).
This makes me think that the documentation can sometimes be misleading. You
can see a bunch of places where a "magic line" which shouldn't be crossed
appears:

Heap: Sizing and Swapping | Elasticsearch: The Definitive Guide [2.x] | Elastic

Limiting Memory Usage | Elasticsearch: The Definitive Guide [2.x] | Elastic

--
Paweł Róg

On Thursday, March 26, 2015 at 6:08:26 PM UTC+1, Jörg Prante wrote:

I don't doubt your numbers.

The difference may depend on the application workload and on how many heap
objects are created. ES is optimized to use very large heap objects to
decrease GC overhead. So I agree the difference for ES may be closer to
0.5 GB / 1 GB and not 8 GB.
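
To make that concrete: one primitive array is a single heap object with a
single pointer, no matter how many values it holds, while boxed values each
carry an object header and a reference of their own. A quick illustration
(my own example, not ES code):

    import java.util.ArrayList;
    import java.util.List;

    public class HeapObjectCount {
        public static void main(String[] args) {
            int n = 1_000_000;

            // One heap object, zero internal pointers:
            // nothing for the GC to trace inside it.
            long[] packed = new long[n];

            // ~1,000,001 heap objects: the list's backing array plus one
            // Long per element, each with a header, each traced by the GC.
            List<Long> boxed = new ArrayList<>(n);
            for (long i = 0; i < n; i++) {
                packed[(int) i] = i;
                boxed.add(i);
            }

            System.out.println("packed length: " + packed.length
                    + ", boxed size: " + boxed.size());
        }
    }

A workload dominated by the packed style has few pointers per gigabyte, so
the size of each pointer matters far less than in pointer-heavy workloads.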

Jörg

On Thu, Mar 26, 2015 at 4:44 PM, Paweł Róg pro...@gmail.com wrote:

Hi,
Thanks for your response, Jörg. Maybe I was not precise enough in my
last e-mail. What I wanted to point out is that, IMHO, with ES I get
something different from the usual ~30G (OOPs) == ~40G (no OOPs) equivalence.
As I wrote in my analysis, for 16G of reachable objects (with Xmx 30G), my
calculations put the overhead of disabled OOPs vs. enabled OOPs at only 0.5G,
and for 100% heap usage (30G of an Xmx of 30G) it would be 1G. This means
that in the case of ES, a 30G heap will always hold less than e.g. a 32G or
33G heap (at least for my query characteristics with lots of aggregations).

So I ask again: what are your thoughts about this? Did I make any
mistake in my estimations?

--
Paweł Róg

On Thursday, March 26, 2015 at 4:21:10 PM UTC+1, Jörg Prante wrote:

There is no "trouble" at all, only a surprise effect for those who do
not understand the effect of compressed OOPs.

Compressed OOPs solve a memory space efficiency problem, but they work
silently. The challenge is that large object pointers waste some of the CPU's
memory bandwidth when the JVM must access objects on a 64-bit addressable
heap. There is a price to pay for encoding/decoding pointers, and that price
is performance. Most people prefer memory efficiency over speed, so the
current Oracle JVM enables compressed OOPs by default. And this feature
works only on heaps smaller than ~30GB. If you configure a larger heap (for
whatever reason), you lose the compressed OOP feature silently. Then you get
better per-pointer performance, but less heap object capacity. Only at a heap
size of ~40G can you again store as many heap objects as with ~30GB.
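
Because the fallback is silent, it is worth verifying what the JVM actually
decided. One way is from inside the process via the HotSpot diagnostic
MXBean (a minimal sketch; com.sun.management is HotSpot-specific, not part
of the Java SE standard):

    import java.lang.management.ManagementFactory;
    import com.sun.management.HotSpotDiagnosticMXBean;

    public class CheckCompressedOops {
        public static void main(String[] args) {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            // "true" on default-sized heaps, silently "false" above the limit.
            System.out.println("UseCompressedOops = "
                    + bean.getVMOption("UseCompressedOops").getValue());
        }
    }

From the command line, java -Xmx40g -XX:+PrintFlagsFinal -version piped
through grep UseCompressedOops shows the same thing without writing any code.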

Jörg

On Thu, Mar 26, 2015 at 2:28 PM, Paweł Róg pro...@gmail.com wrote:

Hi everyone,
Every time we touch the size of the JVM heap for Elasticsearch, we
meet the indisputable statement "don't let the heap get bigger than 32GB -
this is a magical line". Of course, making the heap bigger than 32G means
that we lose compressed OOPs. There are tons of blog posts and articles
which show how switching OOPs off influences application heap usage (e.g.
https://blog.codecentric.de/en/2014/02/35gb-heap-less-32gb-java-jvm-memory-oddities/).
Let's ask ourselves whether this is really a big problem for ES & Lucene too.

I analyzed a few heap dumps from ES. The maximum size of the heap
was set below the magical boundary (Xmx was 30GB). In all cases I saw a
similar pattern, but let's discuss it based on a single example. One heap
dump I took had around 16GB (slightly more) of reachable objects in it.
There were about 70M objects. Of course I cannot just take the 70M figure
to see how much of the heap I could save by having OOPs enabled, so I also
analyzed the number of references to objects (because some objects are
referenced multiple times from multiple places). This gave me a number
around 110M inbound references; since each compressed reference is 4 bytes
smaller than an uncompressed one, OOPs let us save about 0.5GB of memory.
Extrapolating, this would mean around 1GB when the whole heap is in use
(as I wrote earlier, only 16GB of reachable objects were in the heap) - for
the analyzed case. Moreover, I can observe this:

2M long[] arrays which take 6G of heap
280K double[] arrays which take 4.5G of heap
10M byte[] arrays which take 2.5G of heap
4.5M char[] arrays which take 500M of heap

When we sum all of these sizes we see 13.5GB of primitive arrays,
pointed to by fewer than 20M references. As we can see, ES & Lucene use a
lot of arrays of primitives.
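
Putting the numbers above together, here is my back-of-the-envelope
arithmetic as a runnable sketch (the counts come from the heap dump
described above, gathered with a heap analysis tool such as Eclipse MAT):

    public class OopSavingsEstimate {
        public static void main(String[] args) {
            long inboundRefs = 110_000_000L; // inbound references counted in the dump
            int savedPerRef = 8 - 4;         // 64-bit oop vs 32-bit compressed oop
            double savedGB = inboundRefs * (double) savedPerRef / (1L << 30);
            // Prints ~0.41 GB for 16GB of live data; roughly double
            // that when the full 30GB heap is in use.
            System.out.printf("estimated savings: %.2f GB%n", savedGB);
        }
    }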

Elasticsearch is very "memory-hungry", especially when using
aggregations, multi-dimensional aggregations, and parent-child queries. I
think it is sometimes reasonable to have a bigger heap if we have enough
free resources.

Of course, we have to remember that a bigger heap means more work
for the GC (and the collectors currently used in the JVM, CMS and G1, are
not very efficient for very large heaps), but... Is there really a magical
line (32GB) after crossing which we get into "JVM troubles", or can we find
a lot of cases where crossing the magical boundary makes sense?

I'm curious what your thoughts are in this area.

--
Paweł Róg
