Elasticsearch pod uses huge kmem, especially dentry

**Elasticsearch version**: 5.6.1

**Plugins installed**:

**Docker version**: 18.09.2

**K8s version**: 1.11

**Linux kernel**: 3.10.0-862.el7.x86_64

I set the pod memory limit to 10GB. After about 5 days, the node running the ES pod crashes.

dmesg says:

```
[642401.941726] Call Trace:
[642401.941737]  [<ffffffffab313754>] dump_stack+0x19/0x1b
[642401.941743]  [<ffffffffab30e91f>] dump_header+0x90/0x229
[642401.941751]  [<ffffffffaad9a7e6>] ? find_lock_task_mm+0x56/0xc0
[642401.941757]  [<ffffffffaae0f678>] ? try_get_mem_cgroup_from_mm+0x28/0x60
[642401.941761]  [<ffffffffaad9ac94>] oom_kill_process+0x254/0x3d0
[642401.941765]  [<ffffffffaae13486>] mem_cgroup_oom_synchronize+0x546/0x570
[642401.941769]  [<ffffffffaae12900>] ? mem_cgroup_charge_common+0xc0/0xc0
[642401.941772]  [<ffffffffaad9b524>] pagefault_out_of_memory+0x14/0x90
[642401.941776]  [<ffffffffab30cac1>] mm_fault_error+0x6a/0x157
[642401.941782]  [<ffffffffab320846>] __do_page_fault+0x496/0x4f0
[642401.941786]  [<ffffffffab3208d5>] do_page_fault+0x35/0x90
[642401.941790]  [<ffffffffab31c758>] page_fault+0x28/0x30
[642401.941795] Task in /kubepods/pod5df23e59-a37a-11e9-9ebb-000af7c93b90/8edae9bfeed37d826f19a1873bedb433c4223185b0d11e5daa823e1ef1eecd05 killed as a result of limit of /kubepods/pod5df23e59-a37a-11e9-9ebb-000af7c93b90
[642401.941799] memory: usage 10485760kB, limit 10485760kB, failcnt 10987946
[642401.941801] memory+swap: usage 10485760kB, limit 9007199254740988kB, failcnt 0
[642401.941804] kmem: usage 2335668kB, limit 9007199254740988kB, failcnt 0
[642401.941805] Memory cgroup stats for /kubepods/pod5df23e59-a37a-11e9-9ebb-000af7c93b90: cache:0KB rss:0KB rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
[642401.941827] Memory cgroup stats for /kubepods/pod5df23e59-a37a-11e9-9ebb-000af7c93b90/f099264850581d1709f1302ec25c4d22fefdb7a20c46741ccb173580a6e18f20: cache:0KB rss:0KB rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
[642401.941847] Memory cgroup stats for /kubepods/pod5df23e59-a37a-11e9-9ebb-000af7c93b90/8edae9bfeed37d826f19a1873bedb433c4223185b0d11e5daa823e1ef1eecd05: cache:32KB rss:8149400KB rss_huge:6936576KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:8150024KB inactive_file:16KB active_file:0KB unevictable:0KB
[642401.941868] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[642401.942600] [100253]   101 100253  5996909  1763139    3907        0          -998 java
[642401.942607] [100406]     0 100406     4975      470      14        0          -998 create_mapping.
[642401.942611] [100586]   101 100586    31804      304      31        0          -998 controller
[642401.942615] [100929]     0 100929     4977      471      14        0          -998 sgadmin.sh
[642401.942620] [100933]     0 100933  9996270    92644     428        0          -998 java
[642401.942633] Memory cgroup out of memory: Kill process 101686 (java) score 0 or sacrifice child
[642401.943139] Killed process 100406 (create_mapping.) total-vm:19900kB, anon-rss:492kB, file-rss:1388kB, shmem-rss:0kB
[642401.948563] java invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=-998
[642401.948569] java cpuset=8edae9bfeed37d826f19a1873bedb433c4223185b0d11e5daa823e1ef1eecd05 mems_allowed=0-1
[642401.948574] CPU: 2 PID: 101673 Comm: java Kdump: loaded Tainted: G               ------------ T 3.10.0-862.14.4.el7.x86_64 #1
[642401.948576] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.7.1 001/22/2018
```

The kmem usage of 2335668kB is quite large, so I temporarily raised the memory limit to 15G. Now memory.kmem.usage_in_bytes reports 2535668kB and keeps increasing, so I am afraid the pod will be OOM-killed again a few days later.
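For reference, this is how I watch the counters on the node. The cgroup path uses the pod UID from the dmesg output above (substitute your own), and `bytes_to_mib` is just a small helper of my own for readability, not part of any tool:

```shell
# Read the pod's memory-cgroup counters directly on the node:
#   cat /sys/fs/cgroup/memory/kubepods/pod5df23e59-a37a-11e9-9ebb-000af7c93b90/memory.kmem.usage_in_bytes
#   cat /sys/fs/cgroup/memory/kubepods/pod5df23e59-a37a-11e9-9ebb-000af7c93b90/memory.usage_in_bytes

# Tiny helper (my own) to turn byte counts into MiB, integer-truncated:
bytes_to_mib() { echo $(( $1 / 1024 / 1024 )); }

# The kmem usage from the dmesg report above, converted:
bytes_to_mib $(( 2335668 * 1024 ))   # -> 2280 MiB
```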

I have seen the issue: https://github.com/elastic/elasticsearch/issues/23063

My problem is not exactly the same, since I run Elasticsearch in Docker containers.
I checked the memory info from the memory cgroup:

memory.kmem.usage_in_bytes: 1.4G, which now keeps increasing by about 500-800M daily; most of it is slab pages, with dentry objects occupying the largest share.

I kept watching and found that the dentry count never drops.
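The way I track this without root (a sketch, assuming a standard Linux `/proc` layout): `/proc/sys/fs/dentry-state` is world-readable, and its first two fields are the total and the unused (reclaimable) dentry counts:

```shell
# /proc/sys/fs/dentry-state fields: nr_dentry nr_unused age_limit want_pages ...
read nr_dentry nr_unused _ < /proc/sys/fs/dentry-state
echo "dentries: $nr_dentry, unused (reclaimable): $nr_unused"
```

If `nr_unused` is large, the dentries should in principle be reclaimable (e.g. `echo 2 > /proc/sys/vm/drop_caches` as root frees reclaimable slab); if kmem usage still does not drop after that, the growth is presumably not ordinary cache.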

I also found that this may be caused by a cgroup leak, as mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1507149
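A quick check I use for that (my own heuristic, based on the Red Hat bug above): `/proc/cgroups` reports the live cgroup count per controller, and on a healthy node the memory-controller count should roughly track the number of running containers rather than grow without bound as pods churn:

```shell
# /proc/cgroups columns: subsys_name  hierarchy  num_cgroups  enabled
# If num_cgroups for "memory" only ever grows while pods come and go,
# memory cgroups are likely leaking.
awk '$1 == "memory" { print "live memory cgroups:", $3 }' /proc/cgroups
```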

Any ideas? Is this just a kernel bug, or should I not enable kmem accounting at all?
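One workaround I am considering, assuming my reading of the kernel docs is right: recent RHEL 7 and upstream kernels accept the boot parameter `cgroup.memory=nokmem`, which disables kernel-memory accounting entirely. A sketch of applying it on a grubby-based system (please verify your exact kernel supports this parameter first):

```shell
# Assumed workaround: disable kernel-memory accounting at boot.
# cgroup.memory=nokmem is documented in kernel-parameters.txt for recent
# kernels; confirm your kernel honors it before relying on this.
grubby --update-kernel=ALL --args="cgroup.memory=nokmem"
reboot
```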