New Elasticsearch installation takes all memory during startup and gets killed by the OOM killer (7.15.2)

I spent the last few hours trying to install Elasticsearch and was unable to do so. It turns out systemd timed out trying to start the elasticsearch service. Initially I was trying to install it in a VM with 2 GB of memory (Oracle Linux, CentOS, Ubuntu).
Now I have tried on my laptop, and it seems that without any indices or data it consumes 7 GB of memory right after it starts. Below are some not very specific logs, so you might want to test it yourself when you get some time.

Dec 06 11:23:28 arif-laptop kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/elasticsearch.service,task=java,pid=52468,uid=133
Dec 06 11:23:28 arif-laptop kernel: Out of memory: Killed process 52468 (java) total-vm:11727664kB, anon-rss:8436956kB, file-rss:0kB, shmem-rss:0kB, UID:133 pgtables:16924kB oom_score_adj:0
Dec 06 11:23:28 arif-laptop kernel: oom_reaper: reaped process 52468 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Dec 06 11:23:28 arif-laptop systemd[1]: elasticsearch.service: Main process exited, code=killed, status=9/KILL
Dec 06 11:23:28 arif-laptop systemd[1]: elasticsearch.service: Failed with result 'signal'.
Dec 06 11:23:28 arif-laptop systemd[1]: Failed to start Elasticsearch.
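
For anyone reading the kill line above: the `anon-rss` field is in kB, so it helps to convert it before comparing it with `free -h`. A minimal sketch in POSIX shell (the function name `oom_rss_gib` is made up for illustration; the sample line is copied from the log above):

```shell
# Extract anon-rss (in kB) from a kernel "Out of memory: Killed process" line
# and print it in GiB with two decimals.
oom_rss_gib() {
  printf '%s\n' "$1" |
    sed -n 's/.*anon-rss:\([0-9][0-9]*\)kB.*/\1/p' |
    awk '{ printf "%.2f\n", $1 / 1024 / 1024 }'
}

line='Out of memory: Killed process 52468 (java) total-vm:11727664kB, anon-rss:8436956kB, file-rss:0kB, shmem-rss:0kB, UID:133 pgtables:16924kB oom_score_adj:0'
oom_rss_gib "$line"
```

For this line the java process was holding roughly 8 GiB of resident memory when it was killed.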

I just tried to start the service and it got killed instantly:

$ date; free -h; sc-start elasticsearch; date ; free -h ; sc-stop elasticsearch ; free -h
Mon Dec  6 11:57:03 +06 2021
              total        used        free      shared  buff/cache   available
Mem:           15Gi       7.4Gi       3.7Gi       663Mi       4.3Gi       7.0Gi
Swap:         2.0Gi       1.2Gi       832Mi
Job for elasticsearch.service failed because a fatal signal was delivered to the control process.
See "systemctl status elasticsearch.service" and "journalctl -xe" for details.
Mon Dec  6 11:57:07 +06 2021
              total        used        free      shared  buff/cache   available
Mem:           15Gi       6.8Gi       7.8Gi       603Mi       893Mi       7.7Gi
Swap:         2.0Gi       2.0Gi       4.0Mi
              total        used        free      shared  buff/cache   available
Mem:           15Gi       6.8Gi       7.7Gi       603Mi       899Mi       7.7Gi
Swap:         2.0Gi       2.0Gi       4.0Mi
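
The "available" column printed by `free` comes from the `MemAvailable` field in `/proc/meminfo`, so the same check can be scripted before starting a large process. A rough sketch, assuming a Linux host (both function names are made up for illustration, and the 8 GiB threshold is only an example value):

```shell
# Print MemAvailable (kB) from a meminfo-style file; defaults to /proc/meminfo.
mem_available_kb() {
  awk '/^MemAvailable:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Succeed only if at least NEED_KB of memory is currently available.
has_headroom_kb() {
  need_kb="$1"
  avail_kb="$(mem_available_kb "${2:-/proc/meminfo}")"
  [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$need_kb" ]
}

# Example: is there room for a process that will touch about 8 GiB?
if has_headroom_kb $((8 * 1024 * 1024)); then
  echo "enough headroom"
else
  echo "not enough headroom: expect swapping or the OOM killer"
fi
```

With about 7.0Gi available and swap already partly used, as in the first `free` output above, a check like this would fail for an 8 GiB process.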

Journal log:

Dec 06 11:57:03 arif-laptop systemd[1]: Starting Elasticsearch...
Dec 06 11:57:07 arif-laptop kernel: qemu-system-x86 invoked oom-killer: gfp_mask=0x400dc0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), order=0, oom_score_adj=0
Dec 06 11:57:07 arif-laptop kernel: CPU: 7 PID: 17801 Comm: qemu-system-x86 Tainted: G           OE     5.11.0-41-generic #45~20.04.1-Ubuntu
Dec 06 11:57:07 arif-laptop kernel: Hardware name:
Dec 06 11:57:07 arif-laptop kernel: Call Trace:
Dec 06 11:57:07 arif-laptop kernel:  dump_stack+0x74/0x92
Dec 06 11:57:07 arif-laptop kernel:  dump_header+0x4f/0x1f6
Dec 06 11:57:07 arif-laptop kernel:  oom_kill_process.cold+0xb/0x10
Dec 06 11:57:07 arif-laptop kernel:  out_of_memory.part.0+0x1ee/0x460
Dec 06 11:57:07 arif-laptop kernel:  out_of_memory+0x6d/0xd0
Dec 06 11:57:07 arif-laptop kernel:  __alloc_pages_slowpath.constprop.0+0xc4d/0xd20
Dec 06 11:57:07 arif-laptop kernel:  __alloc_pages_nodemask+0x2a0/0x300
Dec 06 11:57:07 arif-laptop kernel:  alloc_pages_current+0x87/0xe0
Dec 06 11:57:07 arif-laptop kernel:  __get_free_pages+0x11/0x40
Dec 06 11:57:07 arif-laptop kernel:  kvm_mmu_topup_memory_cache+0x5c/0x80 [kvm]
Dec 06 11:57:07 arif-laptop kernel:  mmu_topup_memory_caches+0x3d/0x80 [kvm]
Dec 06 11:57:07 arif-laptop kernel:  direct_page_fault+0xb6/0x4e0 [kvm]
Dec 06 11:57:07 arif-laptop kernel:  ? mtrr_lookup_start.constprop.0+0x75/0xa0 [kvm]
Dec 06 11:57:07 arif-laptop kernel:  ? kvm_mtrr_check_gfn_range_consistency+0x61/0x120 [kvm]
Dec 06 11:57:07 arif-laptop kernel:  kvm_tdp_page_fault+0x77/0x90 [kvm]
Dec 06 11:57:07 arif-laptop kernel:  kvm_mmu_page_fault+0x67/0x150 [kvm]
Dec 06 11:57:07 arif-laptop kernel:  handle_ept_violation+0x111/0x390 [kvm_intel]
Dec 06 11:57:07 arif-laptop kernel:  vmx_handle_exit+0x10e/0x790 [kvm_intel]
Dec 06 11:57:07 arif-laptop kernel:  vcpu_enter_guest+0x837/0xf80 [kvm]
Dec 06 11:57:07 arif-laptop kernel:  kvm_arch_vcpu_ioctl_run+0xe0/0x5a0 [kvm]
Dec 06 11:57:07 arif-laptop kernel:  kvm_vcpu_ioctl+0x247/0x5f0 [kvm]
Dec 06 11:57:07 arif-laptop kernel:  __x64_sys_ioctl+0x91/0xc0
Dec 06 11:57:07 arif-laptop kernel:  do_syscall_64+0x38/0x90
Dec 06 11:57:07 arif-laptop kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Dec 06 11:57:07 arif-laptop kernel: RIP: 0033:0x7f3ef1f4c50b
Dec 06 11:57:07 arif-laptop kernel: Code: Unable to access opcode bytes at RIP 0x7f3ef1f4c4e1.
Dec 06 11:57:07 arif-laptop kernel: RSP: 002b:00007f3eead7a5b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Dec 06 11:57:07 arif-laptop kernel: RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f3ef1f4c50b
Dec 06 11:57:07 arif-laptop kernel: RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000011
Dec 06 11:57:07 arif-laptop kernel: RBP: 000055c04dd00650 R08: 000055c04b80f1d0 R09: 0000000000000058
Dec 06 11:57:07 arif-laptop kernel: R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
Dec 06 11:57:07 arif-laptop kernel: R13: 000055c04be7c080 R14: 0000000000000001 R15: 0000000000000000
Dec 06 11:57:07 arif-laptop kernel: Mem-Info:
Dec 06 11:57:07 arif-laptop kernel: active_anon:1126118 inactive_anon:2736983 isolated_anon:0
                                              active_file:230 inactive_file:217 isolated_file:0
                                              unevictable:4875 dirty:0 writeback:0
                                              slab_reclaimable:27606 slab_unreclaimable:52336
                                              mapped:10773 shmem:157755 pagetables:23456 bounce:0
                                              free:33763 free_pcp:807 free_cma:0
Dec 06 11:57:07 arif-laptop kernel: Node 0 active_anon:4504472kB inactive_anon:10947932kB active_file:920kB inactive_file:868kB unevictable:19500kB isolated(anon):0kB isolated(file):0kB mapped:43092kB dirty:0kB writeback:0kB shmem:631020kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 2082816kB writeback_tmp:0kB kernel_stack:22432kB pagetables:93824kB all_unreclaimable? no
Dec 06 11:57:07 arif-laptop kernel: Node 0 DMA free:13820kB min:64kB low:80kB high:96kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15912kB managed:15868kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Dec 06 11:57:07 arif-laptop kernel: lowmem_reserve[]: 0 2106 15649 15649 15649
Dec 06 11:57:07 arif-laptop kernel: Node 0 DMA32 free:63200kB min:9084kB low:11352kB high:13620kB reserved_highatomic:0KB active_anon:673164kB inactive_anon:1464620kB active_file:16kB inactive_file:0kB unevictable:0kB writepending:0kB present:2315200kB managed:2249472kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Dec 06 11:57:07 arif-laptop kernel: lowmem_reserve[]: 0 0 13543 13543 13543
Dec 06 11:57:07 arif-laptop kernel: Node 0 Normal free:58032kB min:58428kB low:73032kB high:87636kB reserved_highatomic:0KB active_anon:3831308kB inactive_anon:9483052kB active_file:580kB inactive_file:572kB unevictable:19500kB writepending:0kB present:14196736kB managed:13877388kB mlocked:32kB bounce:0kB free_pcp:3228kB local_pcp:424kB free_cma:0kB
Dec 06 11:57:07 arif-laptop kernel: lowmem_reserve[]: 0 0 0 0 0
Dec 06 11:57:07 arif-laptop kernel: Node 0 DMA: 3*4kB (U) 2*8kB (U) 4*16kB (U) 1*32kB (U) 2*64kB (U) 2*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 13820kB
Dec 06 11:57:07 arif-laptop kernel: Node 0 DMA32: 241*4kB (UME) 566*8kB (UME) 361*16kB (UME) 229*32kB (UME) 139*64kB (UME) 93*128kB (UME) 59*256kB (UE) 9*512kB (UME) 2*1024kB (ME) 1*2048kB (M) 0*4096kB = 63204kB
Dec 06 11:57:07 arif-laptop kernel: Node 0 Normal: 3271*4kB (UE) 3040*8kB (UE) 908*16kB (UE) 172*32kB (UE) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 57436kB
Dec 06 11:57:07 arif-laptop kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Dec 06 11:57:07 arif-laptop kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Dec 06 11:57:07 arif-laptop kernel: 225178 total pagecache pages
Dec 06 11:57:07 arif-laptop kernel: 66920 pages in swap cache
Dec 06 11:57:07 arif-laptop kernel: Swap cache stats: add 1284482, delete 1218001, find 325436/427251
Dec 06 11:57:07 arif-laptop kernel: Free swap  = 0kB
Dec 06 11:57:07 arif-laptop kernel: Total swap = 2097148kB
Dec 06 11:57:07 arif-laptop kernel: 4131962 pages RAM
Dec 06 11:57:07 arif-laptop kernel: 0 pages HighMem/MovableOnly
Dec 06 11:57:07 arif-laptop kernel: 96280 pages reserved
Dec 06 11:57:07 arif-laptop kernel: 0 pages hwpoisoned
Dec 06 11:57:07 arif-laptop kernel: Tasks state (memory values in pages):
Dec 06 11:57:07 arif-laptop kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Dec 06 11:57:07 arif-laptop kernel: [    590]     0   590    17041      211   143360      154          -250 systemd-journal
Dec 06 11:57:07 arif-laptop kernel: [    617]     0   617     6080      828    65536       90         -1000 systemd-udevd
Dec 06 11:57:07 arif-laptop kernel: [   1384]     0  1384     3642       70    61440       61             0 vmware-usbarbit
Dec 06 11:57:07 arif-laptop kernel: [   1420]     0  1420   227512      263   282624     3427             0 multipassd
Dec 06 11:57:07 arif-laptop kernel: [   1433]     0  1433   374199     3428   380928     2253          -999 containerd
Dec 06 11:57:07 arif-laptop kernel: [   1442]     0  1442    59629       93    86016      133             0 boltd
Dec 06 11:57:07 arif-laptop kernel: [   1962]   120  1962    81763      272   147456      249             0 whoopsie
Dec 06 11:57:07 arif-laptop kernel: [   1972]   116  1972     2816       31    65536       88             0 kerneloops
Dec 06 11:57:07 arif-laptop kernel: [   1989]   116  1989     2816       25    61440       94             0 kerneloops
Dec 06 11:57:07 arif-laptop kernel: [   2076]     0  2076      624       17    45056        0             0 bpfilter_umh
Dec 06 11:57:07 arif-laptop kernel: [   2349] 65534  2349     2312       28    61440       65             0 dnsmasq
Dec 06 11:57:07 arif-laptop kernel: [   2488]     0  2488  1302645   370344  4947968   154543             0 qemu-system-x86
Dec 06 11:57:07 arif-laptop kernel: [   2564]  1000  2564     4821      316    77824      274             0 systemd
Dec 06 11:57:07 arif-laptop kernel: [   2565]  1000  2565    42367      258    94208      788             0 (sd-pam)
Dec 06 11:57:07 arif-laptop kernel: [   2571]  1000  2571     5644       79    81920     1701             0 powerline-daemon
Dec 06 11:57:07 arif-laptop kernel: [   2574]  1000  2574   167280     3065   225280     3402             0 tracker-miner-f
Dec 06 11:57:07 arif-laptop kernel: [   2578]  1000  2578     1921      174    53248       58             0 dbus-daemon
Dec 06 11:57:07 arif-laptop kernel: [   6185]     0  6185   269127      349   204800       17          -998 containerd-shim
Dec 06 11:57:07 arif-laptop kernel: [   6326]  1000  6326   448914     1611   348160     2650             0 docker
Dec 06 11:57:07 arif-laptop kernel: [  17794]     0 17794  1488133   460314  5279744    69394             0 qemu-system-x86
Dec 06 11:57:07 arif-laptop kernel: [  17979]  1000 17979    93756      117   192512      490             0 multipass
Dec 06 11:57:07 arif-laptop kernel: [  62363]   133 62363  2390613  2039521 16625664     1250             0 java
Dec 06 11:57:07 arif-laptop kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/elasticsearch.service,task=java,pid=62363,uid=133
Dec 06 11:57:07 arif-laptop kernel: Out of memory: Killed process 62363 (java) total-vm:9562452kB, anon-rss:8158084kB, file-rss:0kB, shmem-rss:0kB, UID:133 pgtables:16236kB oom_score_adj:0
Dec 06 11:57:07 arif-laptop kernel: oom_reaper: reaped process 62363 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Dec 06 11:57:07 arif-laptop systemd[1]: elasticsearch.service: Main process exited, code=killed, status=9/KILL
Dec 06 11:57:07 arif-laptop systemd[1]: elasticsearch.service: Failed with result 'signal'.
Dec 06 11:57:07 arif-laptop systemd[1]: Failed to start Elasticsearch.
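
A note on reading the "Tasks state" table above: its memory values are in pages, not kB. A small sketch of the conversion, assuming the usual 4 KiB page size on x86_64 (check with `getconf PAGESIZE`; the function name `pages_to_kb` is made up for illustration):

```shell
# Convert a page count from the "Tasks state" table to kB.
# The second argument is the page size in kB and defaults to 4 (an assumption).
pages_to_kb() {
  echo $(( $1 * ${2:-4} ))
}

pages_to_kb 2039521   # java, pid 62363
pages_to_kb 460314    # qemu-system-x86, pid 17794
pages_to_kb 370344    # qemu-system-x86, pid 2488
```

The java row works out to 8158084 kB, which matches the `anon-rss:8158084kB` in the kill line, while the two qemu processes come to about 1.8 GB and 1.5 GB resident.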

What is your config, both Elasticsearch and JVM?
What is the output from the _cluster/stats?pretty&human API?

It's a fresh installation, so nothing much:

$ cat /etc/elasticsearch/elasticsearch.yml | grep -Ev "^#"
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
$ cat /etc/java-11-openjdk/jvm-amd64.cfg
-server KNOWN
-client IGNORE
-zero KNOWN
-dcevm KNOWN

I can't start Elasticsearch, so I can't check the API stats.

You probably need more heap then, that's where I would start.

I think it's more that you need less heap. By default Elasticsearch assumes it's the only significant service running and therefore all of the resources it sees are available for its use. If that's not the case then the oom-killer will step in as we see here. The solution is either to run it in isolation or configure it to use fewer resources, in particular to reduce its heap size.
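
To make the "reduce its heap size" option concrete, here is a minimal sketch, assuming a package (deb/rpm) install where Elasticsearch reads extra JVM options from files ending in `.options` under `/etc/elasticsearch/jvm.options.d`. The helper name, the `heap.options` file name, and the `1g` value are illustrative only; pick a size that fits what else runs on the machine.

```shell
# Write a heap override file for Elasticsearch into DIR.
# DIR would be /etc/elasticsearch/jvm.options.d on a package install (assumption).
write_heap_options() {
  dir="$1"
  size="$2"
  mkdir -p "$dir" || return 1
  # Use the same value for the initial (-Xms) and maximum (-Xmx) heap.
  printf '%s\n' "-Xms${size}" "-Xmx${size}" > "$dir/heap.options"
}

# Example, run as root and then restart the service:
#   write_heap_options /etc/elasticsearch/jvm.options.d 1g
#   systemctl restart elasticsearch
```

The function only writes the file; nothing changes until the service is restarted.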

(NB this isn't a memory leak)

That did occur to me, more in the sense of setting the heap explicitly rather than leaving it dynamic. But given it's Docker, I figured increasing it would show the same end result.

Yes, when I set the heap size in the systemd configuration, it consumes only the defined memory.
But I have never observed this behavior before. For Logstash I do not have to configure anything; by default it sets the heap to 1 GB.
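
The thread does not show the exact systemd setting that was used. One way it could look, as a sketch under assumptions, is a drop-in that passes the heap size through the `ES_JAVA_OPTS` environment variable. The helper name, the `heap.conf` file name, and the `1g` value are made up for illustration:

```shell
# Write a systemd drop-in into DIR that sets the Elasticsearch heap.
# DIR would be /etc/systemd/system/elasticsearch.service.d (assumption).
write_systemd_heap_dropin() {
  dir="$1"
  size="$2"
  mkdir -p "$dir" || return 1
  cat > "$dir/heap.conf" <<EOF
[Service]
Environment="ES_JAVA_OPTS=-Xms${size} -Xmx${size}"
EOF
}

# Example, run as root:
#   write_systemd_heap_dropin /etc/systemd/system/elasticsearch.service.d 1g
#   systemctl daemon-reload && systemctl restart elasticsearch
```

If the same variable is also set in the package's environment file, that value may take precedence, so it is worth checking only one place defines it.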

Thanks for the help. I guess I can close this one.