Non-heap memory usage growing out of control (leak?)

(Jong Kim) #1

Elasticsearch version: 5.5.1
Search Guard version: 5.5.1 (including netty tcnative 2.0.1)
OS: SUSE Linux Enterprise Server 12 (x86_64)
Kernel version: 4.4.114-94.11-default
java version: "1.8.0_152"
Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)

Elasticsearch index size: 3.3 GB (it becomes 3.0 GB if I force merge it to a single segment)
RAM: 16 GB
Java heap size given to Elasticsearch: 3.9 GB

When we run a load test on this system, the server behaves well at first. But within a few hours it runs out of memory and gets killed by the OS OOM killer.

Monitoring shows that the heap space is managed well, with no sign of a memory leak: the system never hits an out-of-memory error in the heap, and the time spent on GC stays within the normal range.

The problem is the ever-growing non-heap memory of the JVM. When I first start the Elasticsearch server, the virtual memory size of the process is around 10 GB, which seems reasonable given its heap size and the size of the index files. Rather quickly, though, the virtual memory size grows along with the resident memory size, pushing the system out of memory until the Elasticsearch process is killed by the kernel's OOM killer.
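For what it's worth, one way to break down native allocations is the JVM's Native Memory Tracking. A sketch of the change to `jvm.options` (a restart is required, and tracking adds some overhead):

```
# jvm.options — enable Native Memory Tracking (summary level)
-XX:NativeMemoryTracking=summary
```

After restarting, `jcmd <pid> VM.native_memory summary` prints per-category totals (heap, thread stacks, metaspace, internal, etc.), which should show whether the growth is within the JVM's own accounting or comes from allocations it cannot see, such as netty-tcnative's OpenSSL buffers.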

Here's the output of the "cat /proc/&lt;pid&gt;/status" command, obtained shortly before the process died.

search1:/var/opt/elasticsearch/logs # cat /proc/5171/status
Name: java
State: S (sleeping)
Tgid: 5171
Ngid: 0
Pid: 5171
PPid: 1
TracerPid: 0
Uid: 498 498 498 498
Gid: 498 498 498 498
FDSize: 512
Groups: 15 498
NStgid: 5171
NSpid: 5171
NSpgid: 5171
NSsid: 5171
VmPeak: 29431656 kB
VmSize: 27339768 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 18658992 kB
VmRSS: 17729396 kB
RssAnon: 16920968 kB
RssFile: 808428 kB
RssShmem: 0 kB
VmData: 23978388 kB
VmStk: 132 kB
VmExe: 4 kB
VmLib: 20244 kB
VmPTE: 38944 kB
VmPMD: 136 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
Threads: 64
SigQ: 0/80244
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 2000000181005ccf
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
Seccomp: 2
Cpus_allowed: f
Cpus_allowed_list: 0-3
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 23
nonvoluntary_ctxt_switches: 11
search1:/var/opt/elasticsearch/logs #
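As a rough sanity check on the numbers above (a sketch; the values are copied from the dump in this post), subtracting the configured heap from `RssAnon` gives the anonymous memory the JVM holds outside the heap:

```shell
# Values copied from the /proc/5171/status dump above.
rss_anon_kb=16920968            # RssAnon: heap + all native allocations
heap_kb=$((3900 * 1024))        # ~3.9 GB heap given to Elasticsearch
native_kb=$((rss_anon_kb - heap_kb))
echo "non-heap anonymous memory: $((native_kb / 1024)) MB"
# → non-heap anonymous memory: 12624 MB
```

So roughly 12.6 GB of resident anonymous memory sits outside the heap, which is far more than thread stacks and metaspace alone would explain.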

Why is the non-heap memory usage growing this large when the index is only around 3 GB? What can I do to keep the memory usage under control and prevent the process from being killed by the OOM killer? Any insights would be greatly appreciated.


(system) #2

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.