Out of memory: Kill process


(Chen Wei Hsu) #1

Hi,
I have a dev server running Elasticsearch 2.3.5 and Kibana 4.5 on Ubuntu 16.04.1 LTS. Originally the server had only 1G of RAM, and Elasticsearch was constantly killed by the OOM killer, so I added more memory to the VM. It now has 4G, but Elasticsearch still gets killed. I think something must be wrong with the settings, because my production server, which also has 4G of memory, never hits OOM, though it runs Ubuntu 14.04 LTS. I am still confused about what is going wrong.

https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html
https://www.elastic.co/guide/en/elasticsearch/reference/5.x/setting-system-settings.html#sysconfig
https://www.elastic.co/guide/en/elasticsearch/reference/2.1/setup-configuration.html#setup-configuration-memory

Following the links above, I made these configuration changes:
/etc/default/elasticsearch
ES_HEAP_SIZE=2g
ES_JAVA_OPTS="-Xms2g -Xmx2g"
MAX_OPEN_FILES=65535
MAX_LOCKED_MEMORY=unlimited

/etc/elasticsearch/elasticsearch.yml
bootstrap.mlockall: true

/usr/lib/systemd/system/elasticsearch.service
LimitMEMLOCK=infinity

/etc/fstab
Commented out the swap entry.

I am still getting OOM kill messages in syslog every day, and it is very annoying to have to restart the Elasticsearch service every day. How can I fix this?

These are the OOM messages:
Nov 10 18:03:24 elk2dev kernel: [204872.040150] [ 1067] 0 1067 1431 884 8 3 0 -17 iscsid
Nov 10 18:03:24 elk2dev kernel: [204872.040152] [ 1121] 0 1121 4673 362 13 3 0 0 agetty
Nov 10 18:03:24 elk2dev kernel: [204872.040154] [ 1138] 0 1138 69272 178 39 3 0 0 polkitd
Nov 10 18:03:24 elk2dev kernel: [204872.040156] [ 3754] 111 3754 1059531 594141 1228 7 0 0 java
Nov 10 18:03:24 elk2dev kernel: [204872.040158] Out of memory: Kill process 3754 (java) score 588 or sacrifice child
Nov 10 18:03:24 elk2dev kernel: [204872.040285] Killed process 3754 (java) total-vm:4238124kB, anon-rss:2350724kB, file-rss:25840kB
Nov 10 18:03:25 elk2dev systemd[1]: elasticsearch.service: Main process exited, code=killed, status=9/KILL
Nov 10 18:03:25 elk2dev kibana[889]: {"type":"log","@timestamp":"2016-11-10T10:03:25+00:00","tags":["error","elasticsearch"],"pid":889,"message":"Request error, retrying -- connect ECONNREFUSED 127.0.0.1:9200"}
Nov 10 18:03:25 elk2dev kibana[889]: {"type":"log","@timestamp":"2016-11-10T10:03:25+00:00","tags":["warning","elasticsearch"],"pid":889,"message":"Unable to revive connection: http://localhost:9200/"}
Nov 10 18:03:25 elk2dev kibana[889]: {"type":"log","@timestamp":"2016-11-10T10:03:25+00:00","tags":["warning","elasticsearch"],"pid":889,"message":"No living connections"}
Nov 10 18:03:25 elk2dev systemd[1]: elasticsearch.service: Unit entered failed state.
Nov 10 18:03:25 elk2dev systemd[1]: elasticsearch.service: Failed with result 'signal'.
Nov 10 18:03:25 elk2dev kibana[889]: {"type":"log","@timestamp":"2016-11-10T10:03:25+00:00","tags":["status","plugin:elasticsearch","error"],"pid":889,"name":"plugin:elasticsearch","state":"red","message":"Status changed from red to red - No Living connections","prevState":"red","prevMsg":"Request Timeout after 3000ms"}
Nov 10 18:03:28 elk2dev kibana[889]: {"type":"log","@timestamp":"2016-11-10T10:03:28+00:00","tags":["warning","elasticsearch"],"pid":889,"message":"Unable to revive connection: http://localhost:9200/"}
Nov 10 18:03:28 elk2dev kibana[889]: {"type":"log","@timestamp":"2016-11-10T10:03:28+00:00","tags":["warning","elasticsearch"],"pid":889,"message":"No living connections"}
Nov 10 18:03:28 elk2dev kibana[889]: {"type":"log","@timestamp":"2016-11-10T10:03:28+00:00","tags":["status","plugin:elasticsearch","error"],"pid":889,"name":"plugin:elasticsearch","state":"red","message":"Status changed from red to red - Unable to connect to Elasticsearch at http://localhost:9200.","prevState":"red","prevMsg":"No Living connections"}
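
A note on reading the kernel lines in the log above. The per-process table rows appear to follow the usual OOM killer layout (pid, uid, tgid, total_vm, rss, ...), with total_vm and rss counted in pages rather than kB. The header row is not in the paste, so that column order is an assumption, as is the 4 kB page size. A minimal sketch of the conversion, using a hypothetical helper named pages_to_kb:

```shell
# Convert a page count from the OOM killer's task table to kB.
# Assumes 4 kB pages; the real value can be checked with: getconf PAGESIZE
pages_to_kb() {
  echo $(( $1 * 4 ))
}

# rss column of the java row (pid 3754)
pages_to_kb 594141     # prints 2376564

# total_vm column of the same row
pages_to_kb 1059531    # prints 4238124
```

Both results agree with the "Killed process" line: 4238124 kB is the reported total-vm, and 2376564 kB equals anon-rss plus file-rss (2350724 + 25840). Under those assumptions the JVM was holding roughly 2.3 GB resident when it was killed, which is more than the 2g heap, since a JVM also uses memory outside the heap.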


(Mark Walkom) #2

You should disable the OS's OOM killer.


(Chen Wei Hsu) #3

Okay, based on your suggestion, I found some advice about memory overcommit:

/etc/sysctl.conf
vm.overcommit_memory = 2
# reload
sysctl -p

I have made the change. I will accept the answer if my dev server doesn't crash from OOM again. Thanks!


(Chen Wei Hsu) #4

I cannot start Elasticsearch now. I get this error message:

Nov 12 16:54:08 elk2dev systemd[1]: Starting Elasticsearch...
Nov 12 16:54:08 elk2dev systemd[1]: Started Elasticsearch.
Nov 12 16:54:08 elk2dev elasticsearch[1306]: Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x0000000085330000, 2060255232, 0) failed; error='Cannot allocate memory' (errno=12)
Nov 12 16:54:08 elk2dev elasticsearch[1306]: #
Nov 12 16:54:08 elk2dev elasticsearch[1306]: # There is insufficient memory for the Java Runtime Environment to continue.
Nov 12 16:54:08 elk2dev elasticsearch[1306]: # Native memory allocation (mmap) failed to map 2060255232 bytes for committing reserved memory.
Nov 12 16:54:08 elk2dev elasticsearch[1306]: # An error report file with more information is saved as:
Nov 12 16:54:08 elk2dev elasticsearch[1306]: # /tmp/hs_err_pid1306.log
Nov 12 16:54:08 elk2dev systemd[1]: elasticsearch.service: Main process exited, code=exited, status=1/FAILURE
Nov 12 16:54:08 elk2dev systemd[1]: elasticsearch.service: Unit entered failed state.
Nov 12 16:54:08 elk2dev systemd[1]: elasticsearch.service: Failed with result 'exit-code'.

/tmp/hs_err_pid1306.log

There is insufficient memory for the Java Runtime Environment to continue.
Native memory allocation (mmap) failed to map 2060255232 bytes for committing reserved memory.
Possible reasons:
The system is out of physical RAM or swap space
In 32 bit mode, the process size limit was hit
Possible solutions:
Reduce memory load on the system
Increase physical memory or swap space
Check if swap backing store is full
Use 64 bit Java on a 64 bit OS
Decrease Java heap size (-Xmx/-Xms)
Decrease number of Java threads
Decrease Java thread stack sizes (-Xss)
Set larger code cache with -XX:ReservedCodeCacheSize=
This output file may be truncated or incomplete.

Out of Memory Error (os_linux.cpp:2627), pid=1306, tid=0x00007f0dd209f700

JRE version: (8.0_111-b14) (build )
Java VM: Java HotSpot(TM) 64-Bit Server VM (25.111-b14 mixed mode linux-amd64 compressed oops)
Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again

--------------- T H R E A D ---------------

Current thread (0x00007f0dcc00b000): JavaThread "Unknown thread" [_thread_in_vm, id=1323, stack(0x00007f0dd1f9f000,0x00007f0dd20a0000)]

Stack: [0x00007f0dd1f9f000,0x00007f0dd20a0000], sp=0x00007f0dd209e370, free space=1020k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0xac5c2a] VMError::report_and_die()+0x2ba
V [libjvm.so+0x4fc50b] report_vm_out_of_memory(char const*, int, unsigned long, VMErrorType, char const*)+0x8b
V [libjvm.so+0x922ae3] os::Linux::commit_memory_impl(char*, unsigned long, bool)+0x103
V [libjvm.so+0x923039] os::pd_commit_memory(char*, unsigned long, unsigned long, bool)+0x29
V [libjvm.so+0x91d33a] os::commit_memory(char*, unsigned long, unsigned long, bool)+0x2a
V [libjvm.so+0xac1989] VirtualSpace::expand_by(unsigned long, bool)+0x199
V [libjvm.so+0xac24de] VirtualSpace::initialize(ReservedSpace, unsigned long)+0xee
V [libjvm.so+0x5f9e61] CardGeneration::CardGeneration(ReservedSpace, unsigned long, int, GenRemSet*)+0xf1
V [libjvm.so+0x4e5c2e] ConcurrentMarkSweepGeneration::ConcurrentMarkSweepGeneration(ReservedSpace, unsigned long, int, CardTableRS*, bool, FreeBlockDictionary::DictionaryChoice)+0x4e
V [libjvm.so+0x5faf22] GenerationSpec::init(ReservedSpace, int, GenRemSet*)+0xf2
V [libjvm.so+0x5e9d5e] GenCollectedHeap::initialize()+0x1de
V [libjvm.so+0xa8dd53] Universe::initialize_heap()+0xf3
V [libjvm.so+0xa8e2be] universe_init()+0x3e
V [libjvm.so+0x63c925] init_globals()+0x65
V [libjvm.so+0xa719be] Threads::create_vm(JavaVMInitArgs*, bool*)+0x23e
V [libjvm.so+0x6d11c4] JNI_CreateJavaVM+0x74
C [libjli.so+0x745e] JavaMain+0x9e
C [libpthread.so.0+0x770a] start_thread+0xca
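
For context on the failure above: with vm.overcommit_memory = 2 the kernel refuses any allocation that would push total committed memory past CommitLimit, which the kernel's overcommit documentation describes as swap plus overcommit_ratio percent of RAM (the ratio defaults to 50). The sketch below only does that arithmetic. The 4 GiB RAM figure is rounded from the first post, swap is taken as 0 because the fstab entry was commented out, huge pages are ignored, and commit_limit_kb is a hypothetical helper, not a system command.

```shell
# Estimate the commit limit enforced when vm.overcommit_memory=2.
#   CommitLimit = RAM * overcommit_ratio / 100 + swap
# Arguments: RAM in kB, swap in kB, ratio in percent.
commit_limit_kb() {
  echo $(( $1 * $3 / 100 + $2 ))
}

# Assumed values: 4 GiB RAM, no swap, default ratio of 50
commit_limit_kb 4194304 0 50    # prints 2097152

# Size of the single commit the JVM asked for, converted from bytes to kB
echo $(( 2060255232 / 1024 ))   # prints 2011968
```

If those assumptions hold, the heap commit alone (2011968 kB) takes nearly all of a limit of about 2097152 kB, before counting anything the other processes have already committed, which would explain the errno=12. The real figures on the machine are the CommitLimit and Committed_AS lines in /proc/meminfo.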


(Chen Wei Hsu) #5

Does anyone know what is wrong with my configuration?


(system) #6

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.