Understanding off-heap usage

(Erik Stephens) #1

After upgrading a cluster from 2.3 to 5.5, I'm noticing an increase in total memory. I don't have enough data yet, but it also appears to grow slowly without bound until OOM. I've read other accounts (not strictly elasticsearch related) of how this behavior can look like a memory leak but is really more like an optimization ("PSST! Don't really free this memory because I'm going to need it right back, ok?").

Elasticsearch is running inside a container. It has 30G heap. The host system has 128G. Using openjdk 1.8.0_131.

1st question, how can I restrict the amount of off-heap usage in order to avoid OOMs? I tried setting -XX:MaxDirectMemorySize and also tried MALLOC_ARENA_MAX=2. The malloc arena experiment didn't change the behavior much: total resident memory quickly grew to 80+G. Setting MaxDirectMemorySize=32G improved the situation slightly: resident memory reached around 64G but still continued to climb, albeit at a much slower rate.

2nd question, how can I better understand what is driving the off-heap usage? An increase in this usage from 2.x to 5.x was expected since doc_values are the default. Any tips or references are greatly appreciated. Thanks!

(Mark Walkom) #2

A JVM OOM and off-heap memory are in no way related. The latter is managed entirely by the OS, while the heap is managed by the JVM.

Can you provide more details on the OOM and the off heap use that has you concerned?

(Erik Stephens) #3

It's not JVM OOM. It's OS OOM:

kernel: Out of memory: Kill process 30036 (java) score 777 or sacrifice child

I'm looking for ways to identify or instrument the parts of elasticsearch that are responsible for off-heap usage. I have a shallow understanding that it's mostly from memory mapped byte buffers. My expectation was that -XX:MaxDirectMemorySize would limit that, but it doesn't.
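
One rough way to see where the resident memory sits, independent of any JVM flag, is to split the process's RSS by mapping type using /proc/<pid>/smaps. The sketch below assumes a Linux host; the function name rss_by_kind is made up for illustration. File-backed mappings roughly correspond to memory-mapped files (index files included), which -XX:MaxDirectMemorySize does not cover, since that flag only caps direct ByteBuffer allocations. Everything else (heap, direct buffers, malloc'd native memory, thread stacks) lands in the anonymous bucket.

```shell
# Sum resident memory (kB) by mapping type from an smaps-format file.
# Hypothetical helper; pass /proc/<pid>/smaps (reading it usually needs
# the same user as the process, or root).
rss_by_kind() {
    awk '
        # Mapping header: "start-end perms offset dev inode [path]".
        # A path starting with "/" means the mapping is file-backed.
        $1 ~ /^[0-9a-f]+-[0-9a-f]+$/ { kind = ($6 ~ /^\//) ? "file" : "anon"; next }
        # Per-mapping resident size, reported in kB.
        $1 == "Rss:" { rss[kind] += $2 }
        END { printf "file=%d anon=%d\n", rss["file"], rss["anon"] }
    ' "$1"
}
```

Usage would be something like `rss_by_kind /proc/30036/smaps`. If the anonymous number is far above heap plus direct memory, the growth is probably coming from native allocations rather than from mmap'd files.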

(Erik Stephens) #4

I'm pretty sure it's not from thread stack size. We do see more threads in v5 than v2, but not enough to account for the extra usage. Looking at MetaSpace size now. Curious if anyone has insight into what in elasticsearch could be driving that. Dynamic classes?
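
A quick sanity check on the thread stack theory, as a sketch: multiply the thread count by the per-thread stack size. The helper name thread_stack_mb is made up, and the 1024 kB default is an assumption (the usual 64-bit Linux default; check -Xss in your jvm.options). The result is an upper bound, since stack pages are only made resident as they are touched.

```shell
# Upper bound on thread stack memory, in MB.
# $1 = number of threads, $2 = stack size per thread in kB (assumed 1024).
thread_stack_mb() {
    local threads=$1
    local stack_kb=${2:-1024}
    echo $(( threads * stack_kb / 1024 ))
}

# Example with a live process (the pid is a placeholder):
#   thread_stack_mb "$(ls /proc/<pid>/task | wc -l)"
```

Even a few thousand threads come out at a few GB at most under that assumption, which is well short of the gap being chased here.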

(Erik Stephens) #5

I think jstat cleared MetaSpace as a possible culprit. I've got an elasticsearch process with 103G RSS, with max heap at 30G and max direct memory size at 32G. That leaves about 40G that I can't account for.
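
The arithmetic behind that "about 40G", as a small sketch (unaccounted_gb is a made-up helper name). It treats the heap and the direct memory cap as fully resident, which is the most generous case for the JVM, so the real unexplained amount is at least this much.

```shell
# Unexplained resident memory in GB: RSS minus max heap minus max direct memory.
unaccounted_gb() {
    local rss=$1 heap=$2 direct=$3
    echo $(( rss - heap - direct ))
}

unaccounted_gb 103 30 32   # prints 41
```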

root@dd3f5f814ec1:/# show-stats ()
> {
>     local stat=$1;
>     local path=$2;
>     local data=$(jstat -$stat "file://$path");
>     local fields=($(sed -n 1p <<<"$data"));
>     local vals=($(sed -n 2p <<<"$data"));
>     for k in ${!fields[@]};
>     do
>         echo ${fields[$k]}=${vals[$k]};
>     done
> }

root@dd3f5f814ec1:/# show-stats gccapacity /proc/17667/root/tmp/hsperfdata_elasticsearch/1

From the jstat man page:

MC: Metaspace capacity (kB).
MU: Metaspace utilization (kB).
CCSC: Compressed class space capacity (kB).
CCSU: Compressed class space used (kB).

I've read multiple accounts of this flavor of compressor/deflater leak. I'm seeing slow, leak-like behavior across the entire cluster.

When stored fields are configured with BEST_COMPRESSION, we rely on garbage collection to reclaim Deflater/Inflater instances. However these classes use little JVM memory but may use significant native memory, so it may happen that the OS runs out of native memory before the JVM collects these unreachable Deflater/Inflater instances. We should look into reclaiming native memory more aggressively.

And similar in elasticsearch code:

The expectation that those get garbage collected has me wondering how many other opportunities for leakage there are.

Not sure how much the developers pay attention to this forum. Is any of this worthy of a github issue?

(Mark Walkom) #6

Please do feel free to raise a github issue :slight_smile:
