Elasticsearch version: 5.5.1
Search Guard version: 5.5.1 (including netty tcnative 2.0.1)
OS: SUSE Linux Enterprise Server 12 (x86_64)
Kernel version: 4.4.114-94.11-default
java version: "1.8.0_152"
Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)
Elasticsearch index size: 3.3 GB (it becomes 3.0 GB if I force merge it to a single segment)
RAM: 16 GB
Java heap size given to Elasticsearch: 3.9 GB
When we run a load test on this system, the server behaves well at first, but within a few hours it runs out of memory and the process is killed by the OS OOM killer.
Monitoring shows that the heap is managed well, with no sign of a memory leak: the JVM never hits an out-of-memory error in the heap, and the time it spends on GC stays within the normal range.
The problem is the ever-growing non-heap memory of the JVM. When I first start the Elasticsearch server, the virtual memory size of the process is around 10 GB, which seems reasonable given its heap size and the size of the index files. Fairly quickly, however, the virtual memory size grows along with the resident memory size, pushing the system out of memory until the Elasticsearch process is killed by the kernel OOM killer.
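One way to attribute this native growth would be JVM Native Memory Tracking. This is only a diagnostic sketch under assumptions (it requires a restart, adds some overhead, and only accounts for memory the JVM itself allocates, so direct allocations by native libraries such as netty-tcnative would not appear in its output):

```shell
# Sketch: enable Native Memory Tracking for the Elasticsearch JVM.
# 1) Add to config/jvm.options (restart required, ~5-10% overhead):
#      -XX:NativeMemoryTracking=detail
# 2) While the load test runs, snapshot what the JVM has allocated,
#    broken down into heap, metaspace, threads, GC, internal, etc.:
#      jcmd <pid> VM.native_memory summary
# 3) Take a baseline and later diff against it to see which category grows:
#      jcmd <pid> VM.native_memory baseline
#      jcmd <pid> VM.native_memory summary.diff
```

If the NMT totals stay flat while the process RSS keeps climbing, that would point at allocations outside the JVM's own accounting.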
Here's the output of "cat /proc/<pid>/status", obtained shortly before the process died.
search1:/var/opt/elasticsearch/logs # cat /proc/5171/status
State: S (sleeping)
Uid: 498 498 498 498
Gid: 498 498 498 498
Groups: 15 498
VmPeak: 29431656 kB
VmSize: 27339768 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 18658992 kB
VmRSS: 17729396 kB
RssAnon: 16920968 kB
RssFile: 808428 kB
RssShmem: 0 kB
VmData: 23978388 kB
VmStk: 132 kB
VmExe: 4 kB
VmLib: 20244 kB
VmPTE: 38944 kB
VmPMD: 136 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
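Reading the numbers above: RssAnon (~16.9 GB of anonymous memory) far exceeds the 3.9 GB heap, so roughly 13 GB is non-heap native memory, while the file-backed portion (RssFile, ~0.8 GB, mostly the mapped index files) is small. A minimal sketch of that arithmetic, with the field values copied from the status output above:

```shell
# Estimate the anonymous non-heap footprint from the /proc status fields above.
heap_kb=$((3900 * 1024))   # configured Java heap (~3.9 GB)
rss_anon_kb=16920968       # RssAnon from /proc/5171/status
native_kb=$(( rss_anon_kb - heap_kb ))
echo "anonymous memory outside the heap: about $(( native_kb / 1024 )) MB"
```

So the growth is in anonymous native allocations, not in the memory-mapped index or the page cache.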
Why is the non-heap memory usage growing this large when the index is only around 3 GB? What can I do to keep memory usage under control and prevent the process from being killed by the OOM killer? Any insights would be greatly appreciated.