Hi,
We have a 8core/16GB ( 4 nodes cluster ) for the Elasticsearch and the Elasticsearch process is getting killed.
JDK settings
-Xms8g -Xmx8g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false
Elasticsearch settings:
node.master: true
node.data: true
thread_pool.bulk.queue_size: 50000
thread_pool.search.size: 200
There are error logs:
[2023-04-24T18:59:16,595][WARN ][o.e.m.j.JvmGcMonitorService] [web-49-102-hzifc.node.hzifc.wacai.sdc] [gc][2095859] overhead, spent [14.3s] collecting in the last [15.1s]
[2023-04-24T18:59:33,643][WARN ][o.e.m.j.JvmGcMonitorService] [web-49-102-hzifc.node.hzifc.wacai.sdc] [gc][old][2095860][2952] duration [16.8s], collections [1]/[17s], total [16.8s]/[22.2m], memory [7.6gb]->[7.7gb]/[7.9gb], all_pools {[young] [353.9mb]->[420.5mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] [7.3gb]->[7.3gb]/[7.3gb]}
[2023-04-24T19:01:24,919][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [web-49-102-hzifc.node.hzifc.wacai.sdc] fatal error in thread [Thread-1136062], exiting
java.lang.OutOfMemoryError: Java heap space
at io.netty.buffer.UnpooledHeapByteBuf.allocateArray(UnpooledHeapByteBuf.java:88) ~[?:?]
at io.netty.buffer.UnpooledByteBufAllocator$InstrumentedUnpooledHeapByteBuf.allocateArray(UnpooledByteBufAllocator.java:164) ~[?:?]
at io.netty.buffer.UnpooledHeapByteBuf.<init>(UnpooledHeapByteBuf.java:61) ~[?:?]
at io.netty.buffer.UnpooledByteBufAllocator$InstrumentedUnpooledHeapByteBuf.<init>(UnpooledByteBufAllocator.java:159) ~[?:?]
at io.netty.buffer.UnpooledByteBufAllocator.newHeapBuffer(UnpooledByteBufAllocator.java:82) ~[?:?]
at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:166) ~[?:?]
at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:157) ~[?:?]
at io.netty.buffer.Unpooled.buffer(Unpooled.java:116) ~[?:?]
at io.netty.buffer.Unpooled.copiedBuffer(Unpooled.java:409) ~[?:?]
at org.elasticsearch.http.netty4.Netty4HttpRequestHandler.channelRead0(Netty4HttpRequestHandler.java:73) ~[?:?]
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at org.elasticsearch.http.netty4.pipelining.HttpPipeliningHandler.channelRead(HttpPipeliningHandler.java:68) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at org.elasticsearch.http.netty4.cors.Netty4CorsHandler.channelRead(Netty4CorsHandler.java:86) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) ~[?:?]
at io.netty.handler.codec.MessageToMessageCodec.channelRead(MessageToMessageCodec.java:111) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) ~[?:?]
pass by heap file , we find there is a lot of memory footprint by netty's io.netty.buffer.PoolChunk object
Can you advice whether any other settings to be made to avoid the OOM killer to crash the elasticsearch process.
Thanks