ES & Lucene 32GB heap: myth or fact?

Hi everyone,
Every time we touch the size of the JVM heap for Elasticsearch we run into
the seemingly indisputable statement "don't let the heap be bigger than
32GB - this is a magical line". Of course, making the heap bigger than 32GB
means that we lose compressed OOPs. There are tons of blog posts and
articles which show how switching compressed OOPs off influences
application heap usage (e.g.
https://blog.codecentric.de/en/2014/02/35gb-heap-less-32gb-java-jvm-memory-oddities/).
Let's ask ourselves whether this is a very big problem for ES & Lucene too.

I analyzed a few heap dumps from ES. The maximum heap size was set below
the magical boundary (Xmx was 30GB). In all cases I see a similar pattern,
but let's discuss it based on a single example. One heap dump I took had
around 16GB (slightly more) of reachable objects in it. There were about
70M objects. Of course I cannot just take the 70M object count to see how
much heap I save by having compressed OOPs enabled, so I also analyzed the
number of references to objects (because some objects are referenced
multiple times from multiple places). This gave me around 110M inbound
references, so compressed OOPs save us about 0.5GB of memory (4 bytes per
reference). Extrapolating, this would be around 1GB if the whole heap were
in use (as I wrote earlier, only 16GB of reachable objects were in the
heap), at least for the analyzed case. Moreover, I can observe this:

2M objects of type long[] which take 6G of heap
280K objects of type double[] which take 4.5G of heap
10M objects of type byte[] which take 2.5G of heap
4.5M objects of type char[] which take 500M of heap

When we sum all the sizes we get 13.5GB of primitive arrays pointed to by
fewer than 20M references. As we can see, ES & Lucene use a lot of arrays
of primitives.
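
The arithmetic behind these estimates can be reproduced with a short sketch. It assumes 4-byte references with compressed OOPs and 8-byte references without them, takes the counts quoted above as given, and models only reference fields (any difference in per-object headers is ignored). The variable and function names are illustrative only.

```python
# Back-of-the-envelope estimate using the figures quoted in the post.
BYTES_PER_GB = 1024 ** 3

COMPRESSED_REF_BYTES = 4    # assumed reference size with compressed OOPs
UNCOMPRESSED_REF_BYTES = 8  # assumed reference size without them (64-bit JVM)


def reference_overhead_gb(reference_count):
    """Extra heap, in GB, needed when each reference grows from 4 to 8 bytes."""
    extra_bytes = reference_count * (UNCOMPRESSED_REF_BYTES - COMPRESSED_REF_BYTES)
    return extra_bytes / BYTES_PER_GB


inbound_references = 110_000_000  # inbound references counted in the heap dump
reachable_gb = 16                 # reachable objects in the dump
max_heap_gb = 30                  # Xmx

# Overhead for the dump as taken, then scaled linearly to a completely full heap.
overhead_now_gb = reference_overhead_gb(inbound_references)
overhead_full_gb = overhead_now_gb * max_heap_gb / reachable_gb

# Primitive arrays reported in the dump, in GB.
primitive_arrays_gb = {"long[]": 6.0, "double[]": 4.5, "byte[]": 2.5, "char[]": 0.5}
primitive_total_gb = sum(primitive_arrays_gb.values())

print(f"overhead with 16GB reachable: {overhead_now_gb:.2f} GB")
print(f"overhead with 30GB reachable: {overhead_full_gb:.2f} GB")
print(f"primitive arrays: {primitive_total_gb} GB "
      f"({primitive_total_gb / reachable_gb:.0%} of reachable objects)")
```

Under these assumptions the overhead comes out at roughly 0.4GB for the dump and roughly 0.8GB for a full heap, in line with the rounded 0.5GB and 1GB figures above.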

Elasticsearch is very "memory-hungry", especially when using aggregations,
multi-dimensional aggregations and parent-child queries. I think it is
sometimes reasonable to have a bigger heap if we have enough free resources.

Of course we have to remember that a bigger heap means more work for the GC
(and the collectors currently used in the JVM, CMS and G1, are not very
efficient for large heaps), but... Is there really a magical line (32GB)
beyond which we get into "JVM troubles", or can we find a lot of cases
where crossing the magical boundary makes sense?

I'm curious what your thoughts are in this area.

--
Paweł Róg

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHngsdhSiXbdzYxss25f-JMpe5E5J545zLrW8tnK1e74K%3D4tqg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

There is no "trouble" at all, only a surprise for those who do not
understand the effect of compressed OOPs.

Compressed OOPs solve a memory space efficiency problem but work silently.
The challenge is that large object pointers waste some of the CPU memory
bandwidth when the JVM must access objects on a 64-bit addressable heap.
There is a price to pay for encoding/decoding pointers, and that is
performance. Most people prefer memory efficiency over speed, so the
current Oracle JVM now enables compressed OOPs by default. This feature
works only on heaps smaller than ~30GB. If you configure a larger heap (for
whatever reason) you silently lose the compressed OOP feature. Then you get
better performance, but with less heap object capacity. At a heap size of
~40GB, you can again store as many heap objects as with ~30GB.
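
The encoding/decoding step and the origin of the size limit can be illustrated with a simplified model. This is only a sketch and not HotSpot's actual implementation: it assumes a 32-bit compressed reference, the default object alignment of 8 bytes, and an optional heap base address; `encode` and `decode` are hypothetical helper names.

```python
# Simplified model of compressed OOPs: a 32-bit value stores the object's
# offset from the heap base, divided by the object alignment.
OBJECT_ALIGNMENT_BYTES = 8                       # assumed default alignment
SHIFT = OBJECT_ALIGNMENT_BYTES.bit_length() - 1  # 3 bits for 8-byte alignment
NARROW_BITS = 32                                 # width of a compressed reference


def encode(address, heap_base=0):
    """Compress a full address into a 32-bit value (hypothetical helper)."""
    offset = address - heap_base
    if offset < 0 or offset % OBJECT_ALIGNMENT_BYTES != 0:
        raise ValueError("address must be aligned and inside the heap")
    narrow = offset >> SHIFT
    if narrow >= 1 << NARROW_BITS:
        raise OverflowError("address is beyond the compressed-OOP range")
    return narrow


def decode(narrow, heap_base=0):
    """Expand a 32-bit compressed value back into a full address."""
    return heap_base + (narrow << SHIFT)


# Largest heap that 32-bit compressed references can cover in this model.
max_addressable_bytes = (1 << NARROW_BITS) * OBJECT_ALIGNMENT_BYTES
print(max_addressable_bytes // 1024 ** 3, "GB")

sample_address = 0x7_0000_0008                   # an aligned address near 28GB
assert decode(encode(sample_address)) == sample_address
```

In this model the shift and add on every reference load are the encoding/decoding cost mentioned above, and 2^32 slots of 8 bytes each give the 32GB boundary. The heap size at which a real JVM stops using compressed OOPs can be somewhat lower than that.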

Jörg

Hi,
Thanks for your response, Jörg. Maybe I was not precise enough in my last
e-mail. What I wanted to point out is that, IMHO, in ES I get something
different from ~30G (compressed OOPs) == ~40G (no compressed OOPs). As I
wrote in my analysis, for 16G of reachable objects (with Xmx 30G) my
calculations put the overhead of disabled vs. enabled compressed OOPs at
only 0.5G, and for 100% heap usage (30G out of Xmx 30G) it would be 1G.
This means that a 30G heap will always hold less than e.g. a 32G or 33G
heap in the case of ES (at least for my query characteristics with lots of
aggregations).
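
To make the comparison concrete, here is a rough sketch of how the "equivalent" uncompressed heap size depends on reference density. It assumes references are the only thing that grows (from 4 to 8 bytes) and reuses the 110M references per 16GB measured in the dump; the second, pointer-heavy density is purely hypothetical and chosen so that the result matches the ~30G == ~40G rule of thumb.

```python
BYTES_PER_GB = 1024 ** 3
EXTRA_BYTES_PER_REF = 4  # assumed growth per reference without compressed OOPs


def equivalent_uncompressed_heap_gb(compressed_heap_gb, references_per_gb):
    """Heap needed without compressed OOPs to hold the same objects."""
    reference_count = compressed_heap_gb * references_per_gb
    extra_gb = reference_count * EXTRA_BYTES_PER_REF / BYTES_PER_GB
    return compressed_heap_gb + extra_gb


# Density measured in the ES heap dump: 110M references in 16GB.
es_refs_per_gb = 110_000_000 / 16
es_equivalent_gb = equivalent_uncompressed_heap_gb(30, es_refs_per_gb)

# Hypothetical pointer-heavy workload: one third of a 30GB compressed heap
# is taken up by the references themselves.
pointer_heavy_refs_per_gb = 10 * BYTES_PER_GB / (30 * EXTRA_BYTES_PER_REF)
pointer_heavy_equivalent_gb = equivalent_uncompressed_heap_gb(
    30, pointer_heavy_refs_per_gb
)

print(f"ES-like workload:       30GB compressed ~ {es_equivalent_gb:.1f}GB uncompressed")
print(f"pointer-heavy workload: 30GB compressed ~ {pointer_heavy_equivalent_gb:.1f}GB uncompressed")
```

With the measured density, a 30GB heap with compressed OOPs corresponds to a bit under 31GB without them, so under this model a 32G or 33G heap would already hold more.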

So I ask again: what are your thoughts on this? Did I make any mistake in
my estimates?

--
Paweł Róg

I will not doubt your numbers.

The difference may depend on the application workload and on how many heap
objects are created. ES is optimized to use very large heap objects to
decrease GC overhead. So I agree that the difference for ES may be closer
to 0.5 GB / 1 GB and not 8 GB.

Jörg

Hi,
Exactly, ES is optimized to use large objects (arrays of primitives). This
makes me think that the documentation can sometimes be misleading. You can
see a bunch of places where a "magic line" which shouldn't be crossed
really does appear:
http://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html#compressed_oops
http://www.elastic.co/guide/en/elasticsearch/guide/current/_limiting_memory_usage.html

--
Paweł Róg

The statement "It wastes memory, reduces CPU performance, and makes the GC
struggle with large heaps." reads as if there is a catastrophe waiting and
is a bit overstated. It may waste memory usable by the JVM heap, true. But
it does not reduce CPU performance: OOPs with LP64 exercise memory and
cache bandwidth, not the CPU. And the "GC struggle" comes down solely to
how the GC works with heap objects; it is not related to OOPs. In fact, GC
is a bit slower with compressed OOPs because of the overhead of
encoding/decoding addresses.

Jörg

I think part of what you may be missing is the intent that Elasticsearch
be scaled out rather than up. Other issues arise when you scale up instead
of out, the first of which is that losing a single node of your cluster can
be disastrous. It is also generally far more expensive to scale up than to
scale out.

That said, I am interested in this, as it is increasingly common to have
128GB or 256GB in a typical enterprise machine that doesn't break the
bank.

If I had access to such machines I would run some benchmarks to show the
differences: what does memory utilization look like after ingesting a
large number of docs, or with a given query mix?

One other note: these cautions are not unique to Elasticsearch; they are
made for Solr as well. I do know that the impact of GC in large heaps
is very real, and a very powerful cluster can fall apart when it runs out
of memory if things are not tuned well.

On Friday, March 27, 2015 at 2:36:00 AM UTC-6, Jörg Prante wrote:

The statement "It wastes memory, reduces CPU performance, and makes the GC
struggle with large heaps." reads as if a catastrophe were waiting, which
is a bit overstated. It may waste memory usable by the JVM heap, true. But
it does not reduce CPU performance: OOPs with LP64 exercise memory and
cache bandwidth, not the CPU. And the "GC struggle" comes down solely to
how GC works with heap objects; it is not related to OOPs. In fact, GC is a
bit slower with compressed OOPs because of the overhead of
encoding/decoding addresses.

Jörg

On Fri, Mar 27, 2015 at 8:53 AM, Paweł Róg pro...@gmail.com wrote:

Hi,
Exactly, ES is optimized to use large objects (arrays of primitives).
This makes me think that the documentation can sometimes be misleading. You
can find a number of places where a "magic line" that shouldn't be crossed
appears:

http://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html#compressed_oops

http://www.elastic.co/guide/en/elasticsearch/guide/current/_limiting_memory_usage.html

--
Paweł Róg

On Thursday, March 26, 2015 at 6:08:26 PM UTC+1, Jörg Prante wrote:

I will not doubt your numbers.

The difference may depend on the application workload, that is, on how many
heap objects are created. ES is optimized to use very large heap objects to
decrease GC overhead. So I agree the difference for ES may be closer to
0.5 GB / 1 GB and not 8 GB.

Jörg

On Thu, Mar 26, 2015 at 4:44 PM, Paweł Róg pro...@gmail.com wrote:

Hi,
Thanks for your response, Jörg. Maybe I was not precise enough in my
last e-mail. What I wanted to point out is that, IMHO, in ES I can get
something different from ~30G (compressed OOPs) == ~40G (no compressed OOPs).
As I wrote in my analysis, for 16G of reachable objects (with Xmx 30G) my
calculations put the overhead of disabling compressed OOPs at only 0.5G, and
for 100% heap usage (30G out of Xmx 30G) it would be 1G. This means that a
30G heap will always hold less than e.g. a 32G or 33G heap in the case of ES
(at least for my query characteristics, with lots of aggregations).

So I ask again: what are your thoughts about this? Did I make any
mistake in my estimates?

--
Paweł Róg
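
The estimate in the message above can be reproduced with simple arithmetic. The sketch below is only a back-of-the-envelope check, not a measurement: it assumes that each reference costs 8 bytes without compressed OOPs and 4 bytes with them, and it ignores object headers and alignment padding. The 110M references, 16GB of reachable objects and Xmx of 30GB are the figures from the heap dump in the original post; the variable names are made up for illustration.

```python
# Figures reported for the analyzed heap dump (from the original post).
REFERENCES = 110_000_000        # inbound references counted in the dump
REACHABLE_GB = 16               # reachable objects, in GB
XMX_GB = 30                     # configured maximum heap, in GB

# Assumption: a reference takes 8 bytes uncompressed and 4 bytes compressed.
EXTRA_BYTES_PER_REF = 8 - 4

# Extra heap the same object graph would need without compressed OOPs.
extra_bytes = REFERENCES * EXTRA_BYTES_PER_REF

# Linear extrapolation to a completely full heap (same reference density).
extra_bytes_full_heap = extra_bytes * XMX_GB // REACHABLE_GB

print(f"extra for 16GB of objects:  {extra_bytes / 1e9:.2f} GB")
print(f"extra for a full 30GB heap: {extra_bytes_full_heap / 1e9:.2f} GB")
```

Under these assumptions the result is roughly 0.44GB for the dump and about 0.8GB for a full heap, in line with the 0.5GB and 1GB quoted in the thread.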

On Thursday, March 26, 2015 at 4:21:10 PM UTC+1, Jörg Prante wrote:

There is no "trouble" at all, only a surprise for those who do
not understand the effect of compressed OOPs.

Compressed OOPs solve a memory space efficiency problem but work
silently. The challenge is that large object pointers waste some of the
CPU's memory bandwidth when the JVM must access objects on a 64-bit
addressable heap. There is a price to pay for encoding/decoding pointers,
and that is performance. Most people prefer memory efficiency over speed, so
the current Oracle JVM enables compressed OOPs by default. This feature
works only on heaps smaller than ~30GB. If you configure a larger heap (for
whatever reason) you silently lose the compressed OOPs feature. Then you get
better performance, but with less heap object capacity. At a heap size of
~40G, you can again store as many heap objects as with ~30GB.

Jörg
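
To see what the "~30GB with compressed OOPs holds as much as ~40GB without" rule of thumb implies, one can work out how many references a heap would have to contain for that gap to appear. This is a rough sketch, assuming that the whole gap comes from references that are 4 bytes larger when uncompressed (object headers and padding are ignored). The 110M references and 16GB come from the heap dump in the original post; all names are hypothetical.

```python
GB = 10**9

# Assumption: each reference is 4 bytes larger without compressed OOPs.
EXTRA_BYTES_PER_REF = 4

# Rule of thumb from the thread: ~30GB compressed holds as much as ~40GB uncompressed.
gap_bytes = (40 - 30) * GB
implied_refs = gap_bytes // EXTRA_BYTES_PER_REF
implied_refs_per_gb = implied_refs / 30

# Figures from the analyzed ES heap dump in the original post.
observed_refs_per_gb = 110_000_000 / 16

ratio = implied_refs_per_gb / observed_refs_per_gb
print(f"implied references: {implied_refs:,}")
print(f"implied density is {ratio:.1f}x the density seen in the ES dump")
```

Under these assumptions the rule of thumb corresponds to a heap roughly an order of magnitude denser in references than the analyzed ES dump, which fits the much smaller overhead estimated for ES in this thread.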



On Thu, Mar 26, 2015 at 2:28 PM, Paweł Róg prog88@gmail.com wrote:

When we sum all of sizes we can see 13.5GB of primitive arrays pointed by
less than 20M references. As we can see ES&Lucene use a lot of arrays of
primitives.

I think it depends what takes memory in your heap. For instance, fielddata
mainly uses very large primitive arrays, so the impact of compressed oops
is almost null. The same would be true for aggregations and p/c queries. On
the other hand the filter cache happily uses objects in order to model the
keys (filter/segment pairs) and values (sets of doc ids). So if you have a
large filter cache that contains lots of small entries, compressed oops
could help a bit. This might be true for users who have large collections
of percolator queries too.
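
The contrast described above between fielddata and the filter cache can be illustrated with a small sketch. It is a simplified model, not a description of how Elasticsearch lays out memory: it assumes 4 extra bytes per reference without compressed oops, ignores object headers, and uses two made-up data shapes that carry the same 8MB payload.

```python
# Assumption: an uncompressed reference costs 4 bytes more than a compressed one.
EXTRA_BYTES_PER_REF = 8 - 4


def extra_heap_fraction(payload_bytes: int, n_references: int) -> float:
    """Extra heap needed without compressed oops, relative to the payload."""
    return n_references * EXTRA_BYTES_PER_REF / payload_bytes


# Fielddata-like shape: one long[] with a million values, reached via one reference.
one_big_array = extra_heap_fraction(payload_bytes=8 * 1_000_000, n_references=1)

# Cache-like shape (hypothetical): a million small entries, one reference each.
many_small_entries = extra_heap_fraction(
    payload_bytes=8 * 1_000_000, n_references=1_000_000
)

print(f"one big array:      {one_big_array:.7%}")
print(f"many small entries: {many_small_entries:.0%}")
```

In this model the single large array pays almost nothing for losing compressed oops, while the structure made of many small entries pays about half of its payload again.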

Of course we have to remember that the bigger heap means more work for GC
(and currently used in JVM: CMS or G1 are not very efficient for large
heaps), but ... Is there really a magical line (32GB) after crossing we get
into "JVM troubles" or we can find a lot of cases where crossing the
magical boundary makes sense?

Thanks for bringing this up. I agree this is very important, and it is one
reason why Elasticsearch is moving to doc values by default on
not_analyzed fields in 2.0:
https://github.com/elastic/elasticsearch/pull/10209 (which effectively
moves fielddata memory usage from the heap to the OS cache).

The main issue with large heaps arises when major garbage collections take
long enough that, from the cluster's perspective, the node appears to have
left. I don't think there is a magical line, since it also depends on
the hardware and the complexity of the object graph, but it is probably
something to keep an eye on.

--
Adrien
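
The point about long collections making a node look like it has left the cluster can be sketched as a simple comparison. This is a deliberately simplified model: it assumes the rest of the cluster gives up on a node once it has been unresponsive for longer than a ping timeout multiplied by a number of retries. The function name, parameter names and values are hypothetical stand-ins for whatever fault-detection settings a given cluster uses.

```python
def node_looks_gone(gc_pause_s: float, ping_timeout_s: float, ping_retries: int) -> bool:
    """True if a stop-the-world pause outlasts every fault-detection retry."""
    return gc_pause_s > ping_timeout_s * ping_retries


# Illustrative values only: a short pause versus a very long major collection.
print(node_looks_gone(gc_pause_s=5, ping_timeout_s=30, ping_retries=3))
print(node_looks_gone(gc_pause_s=120, ping_timeout_s=30, ping_retries=3))
```

In this model the question is not the heap size itself but whether the longest pause stays under the detection window, which matches the remark that there is no single magical line.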


Hi,

On Friday, March 27, 2015 at 5:02:05 PM UTC+1, Aaron Mefford wrote:

I think part of what you may be missing is the intent that Elasticsearch
be scaled out rather than up. Other issues arise when you scale up instead
of out, the first of which is that losing a single node of your cluster can
be disastrous. It is also generally far more expensive to scale up than to
scale out.

I don't doubt the value of scaling out. This topic is about the "magic
line" of a 32G heap, not about replacing multiple nodes with a single node.
Please remember that you can still have, for example, 50 machines with a
40G heap each. And please don't forget that distribution also has some
drawbacks.


Hi Adrien,
Thanks for your response.

On Sunday, March 29, 2015 at 10:34:07 PM UTC+2, Adrien Grand wrote:

I think it depends what takes memory in your heap. For instance, fielddata
mainly uses very large primitive arrays, so the impact of compressed oops
is almost null. The same would be true for aggregations and p/c queries. On
the other hand the filter cache happily uses objects in order to model the
keys (filter/segment pairs) and values (sets of doc ids).

I absolutely agree that the number of references in the JVM depends on the
query characteristics. In my case aggregations are widely used, which
explains the number of big objects in the heap.

So if you have a large filter cache that contains lots of small entries,
compressed oops could help a bit. This might be true for users who have
large collections of percolator queries too.

That's right, but without analyzing and counting the number of instances
for both a large filter cache and a large number of percolator queries
(depending on what "large" means here), it's hard to say what it means that
compressed oops "could help a bit".

I agree this is very important and this is a reason why eg. elasticsearch
is moving to doc values by default on not_analyzed fields in 2.0:
https://github.com/elastic/elasticsearch/pull/10209 (which effectively
moves fielddata memory usage from the heap to the OS cache).

I think doc values also have some overhead. When I was analyzing the heap
dump I found a lot of WeakReferences which come from doc values. On the
one hand, they are only weak references. On the other hand, I don't know
whether ES/Lucene removes them from the heap by some mechanism, or whether
more and more of them are added to the heap and they are removed only when
an OOM happens and a Full GC is triggered by the JVM.

The main issue about large heaps is when they make major garbage
collections long enough so that they look like the node left from a cluster
perspective. I don't think there is a magical line since it also depends on
the hardware and the complexity of the graph of objects but it is probably
something to keep an eye on.

Again I agree, but this "long GC" is probably connected with Full GC, which
can stop the world (STW) for long enough. As you also mentioned, you "don't
think there is a magical line".

--
Paweł Róg
