Differences in memory handling between 0.16.x and 0.17.x

I'm only now getting round to migrating from 0.16.2 to 0.17.9 (I hate
changing stuff that works!!)

I have a single test machine with 8GB of RAM and some other stuff
running on it, so I restrict ES to 1GB of memory. This used to work
fine with all the queries I'd ever make.
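
(For reference, the 1GB cap is applied roughly like this - a sketch
assuming the stock bin/elasticsearch wrapper, which as far as I know
reads ES_MIN_MEM / ES_MAX_MEM:)

# cap the test node's heap at 1GB before starting it
export ES_MIN_MEM=1g
export ES_MAX_MEM=1g
bin/elasticsearch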

Since upgrading to 0.17.9, however, generating facets uses up all the
memory and causes the elasticsearch process to hang.

An example query is in the following gist: https://gist.github.com/1370278

(Note that the query is unnecessarily complex simply because it's auto-
generated by code that expects much more complex logic than this test)

When executed without the facets, the correct documents are returned.
When executed with the facets the memory jumps to the maximum and the
CPU thrashes "forever" (and the query "never" returns):

#bash> top
PID  USER     PR NI VIRT  RES  SHR S %CPU %MEM TIME+   COMMAND
6709 elastics 25 0  1360m 1.1g 10m S 49.3 14.6 0:59.82 java

(If I run the same facet over a subset of the data, the memory jumps
to about 900M and the system continues to work)

The log doesn't show out-of-memory errors on the query itself, though
it sometimes gets upset when I try to close the query connection, eg:

2011-11-16 09:43:24.293 [WARN] transport.netty:91 - [Ms. Steed] Exception caught on netty layer [[id: 0x744145b1, /REMOTEIP:61282 => /LOCALIP:9300]]
2011-11-16 09:43:28.523 [WARN] netty.channel.DefaultChannelPipeline:91 - An exception was thrown by a user handler while handling an exception event ([id: 0x744145b1, /REMOTEIP:61282 => /LOCALIP:9300] EXCEPTION: java.io.StreamCorruptedException: invalid data length: 0)
java.lang.OutOfMemoryError: Java heap space
2011-11-16 09:43:51.276 [WARN] transport.netty:91 - [Ms. Steed] Exception caught on netty layer [[id: 0x673a95af, /LOCALIP:55230 => /LOCALIP:9300]]
java.lang.OutOfMemoryError: Java heap space

So I can see a few possibilities:
1] Lucene 3.4 is just more memory hungry and I'll have to live with
that
2] Some setting in my facet doesn't make sense and I was just lucky
with Lucene 3.1
3] Some change to elasticsearch makes it more memory hungry, and maybe
I have to live with that / maybe it's a bug
4] I'm missing some configuration setting which will make it handle
the memory more gracefully

Any thoughts much appreciated!

FWIW, I have been running my "deployment" nodes with "-Xms6656m
-Xmx6656m -Xmn2048m". After about 4 months of continuous use
(including far more numerous and more complex facets and queries), the
CPU started to get busy doing GC, though the nodes still responded
fairly quickly, ie it wasn't quite the same issue described above. I
was going to write a cron job that cycles through the nodes,
restarting one at a time over a period of a month or so - any thoughts
on that strategy?
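
(Very roughly, the sort of thing I had in mind - a sketch only, with
hypothetical hostnames and assuming the nodes are managed by a service
wrapper:)

#!/bin/bash
# /etc/cron.weekly/es-rolling-restart (sketch): restart one node per week,
# so each node on a 4-node cluster gets bounced roughly once a month.
NODES=(es-node1 es-node2 es-node3 es-node4)   # hypothetical hostnames
WEEK=$(date +%V)
IDX=$(( (10#$WEEK - 1) % ${#NODES[@]} ))      # force base-10 so weeks 08/09 don't break
ssh "${NODES[$IDX]}" "sudo service elasticsearch restart"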

(Oh, and apologies if something like this has already been discussed
in a similar thread - a quick search didn't throw anything up. I
haven't been hanging out here as much as I used to because I haven't
needed to do anything to elasticsearch, functional or otherwise; it
just worked perfectly!)

After some more searching, I guess this is similar to
https://groups.google.com/group/elasticsearch/browse_thread/thread/aa46dda29dd3a08a/a7509b3be344c3ce?hl=en&lnk=gst&q=memory#a7509b3be344c3ce

Conclusion from that thread: more memory needed! (Plus cache-clearing
tricks, which wouldn't appear to help with my queries, which normally
facet over large numbers of documents.)

I am still worried that there does seem to be a significant memory
usage difference somewhere between 0.16.2 (Lucene 3.1) and 0.17.9
(Lucene 3.4); I don't want to discover after upgrading that I need 4x
the memory on my operational system.

Updates following some more investigation:

1] On the originally reported "problem": I increased the heap size to
2GB, reran the query, and it worked fine. The total field cache was
1.2GB (from 2194360 documents/2.9GB, fwiw).
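
(In case it helps, this is how I'm reading the field cache size -
assuming I'm looking at the right numbers in node stats, which seems
to be what elasticsearch-head displays:)

# per-node cache stats; I believe the field cache shows up under
# indices.cache (field_size / field_evictions) in the 0.17 output
curl 'http://localhost:9200/_cluster/nodes/stats?pretty=true'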

I'd still be very interested in your thoughts on my empirical claim
that memory usage has increased significantly from 0.16.2...

Does the cache get flushed if space is needed? Eg if I have 3 fields
loaded at 1GB each, with no spare heap, and I try to load a 4th for
another 1GB ... will it flush one of the three (as if I'd called
"clear cache") and then load the 4th?

In terms of being more robust to field space > cache size, have you
considered (/do you already do) something simple like caching then
flushing one index at a time (obviously only if necessary) so the
facet calculation slows down but the system won't die (unless a single
index/shard grows too big, which is obviously "easily" manageable by
creating new indexes and aliasing)?

2] While investigating the original problem, I came across some
interesting behavior on my 0.16.2 cluster and have a couple of
questions about that. I mentioned it earlier, but in summary: 2 of the
nodes have 80% of the documents and both of them have RES almost as
high as VIRT. Both of them are using a lot of CPU (*), and query
performance has definitely slowed (though it remains - just about -
acceptable).

It doesn't appear that this is GC-related as I had originally guessed:
the collection time is 45 minutes (uptime 153 hours, though it's not
clear how uniformly that's distributed), but only changed by a few
seconds over the 10 minutes I've been writing this.

It also doesn't appear that this is related to [1]: according to
elasticsearch-head the filter usage is ~3.5GB/node (field evictions is
0, whatever that means; other memory stats/node: total index size
5-6GB, heap committed 6.3GB, heap used ~4.5GB)

Should I just not worry about this until I'm up to (eg) 0.17.9 and the
"problem" persists?

Or if not, is there any logging I can turn on? (It may take a few days
after a restart for the "problem" to reoccur.)

(*) Eg 10x10s iterations of "top", nothing is reading or writing to
the index over this time:

30344 elastics 18 0 7175m 6.8g 5332 S 11.9 45.2 2019:05 java
30344 elastics 18 0 7175m 6.8g 5332 S 60.6 45.2 2019:11 java
30344 elastics 18 0 7175m 6.8g 5332 S 48.8 45.2 2019:16 java
30344 elastics 18 0 7175m 6.8g 5332 S 42.4 45.2 2019:20 java
30344 elastics 18 0 7175m 6.8g 5332 S 62.8 45.2 2019:27 java
30344 elastics 18 0 7175m 6.8g 5332 S 51.4 45.2 2019:32 java
30344 elastics 18 0 7175m 6.8g 5332 S 38.6 45.2 2019:36 java
30344 elastics 18 0 7175m 6.8g 5332 S 60.7 45.2 2019:42 java
30344 elastics 18 0 7175m 6.8g 5332 S 79.9 45.2 2019:50 java
30344 elastics 18 0 7175m 6.8g 5332 S 28.4 45.2 2019:52 java

Hi,

The amount of memory required for facets did not change between 0.16.x
and 0.17.x. What did change is the move to a resident field cache
instead of a soft one (a soft cache should, though does not always,
evict entries under memory pressure). The reason for that is that the
field data cache is not really a cache per se: it should be kept in
memory at all times, since the act of loading it back is very
expensive, especially if you have repeating search requests that do
sorting / faceting. The capacity you have should accommodate those
memory requirements (at least until there are other storage options
for that "cache").

-shay.banon


Shay,

Thanks for the reply! Apologies for being slow, but can you just
confirm I have understood you correctly:

The (likely) reason for the difference between ES 0.16 (supported the
query) and ES 0.17 (ran out of memory, went into a hung state) was that:

  • ES 0.16 was calculating one facet, flushing its soft cache, then
    calculating the next facet (and each individual field did fit into
    memory)
  • ES 0.17 was trying to load all fields in the query into memory, and
    this exceeded the available heap

Does that sound about right?

And if so, given that my 0.16.x nodes currently use <4GB of field
cache and have 6GB of heap available, does that mean I can safely
migrate to 0.17? (Or is there some equation that 0.16 uses to maintain
spare heap space?)

One (imminent) use case I have is allowing users to specify semi-
arbitrary facets based on fields they choose, which as I understand it
would, from ES 0.17 onwards, reside in memory forever more (and
clearly break the system quite quickly). Can I use "clear cache" with
the fields parameter to manage this? Eg would the following logic work
(rough curl sketch of the clearing step after the list)?

On receiving a faceted query:

  • while heap space is "low":
    • clear the cache for the least frequently used field
  • issue the faceted query
  • update my list of field usage frequencies
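
(The "clear the cache for the least frequently used field" step would,
as I understand the clear-cache API, be something like the following -
'myindex' and 'some_field' are just placeholders, and I'm not sure of
the exact parameter names:)

# drop the field data for one field on one index; parameters are my best
# guess at the 0.17 clear-cache API, please correct me if they're wrong
curl -XPOST 'http://localhost:9200/myindex/_cache/clear?field_data=true&fields=some_field'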

Two more quick questions, then I'll leave you alone :)

1] Is there a max % of the heap the field cache can use (eg if I have
6GB can I safely load 4GB of field "cache"? 5GB? "Safely" =
performance may deteriorate but the system will not die)

2] Is there a way of estimating the amount of field "cache" that a
given field will use? (It doesn't matter if it's a bit slow to
calculate, my data size grows pretty gradually)

Many thanks, as always, for any further insight you can provide.

Alex