ES 6.4.2 client nodes out of memory: Java heap space

[2019-11-02T01:09:15,188][WARN ][i.n.c.AbstractChannelHandlerContext] An exception 'java.lang.OutOfMemoryError: Java heap space' [enable DEBUG level for full stacktrace] was thrown by a user handler's exceptionCaught() method while handling the following exception:
java.lang.OutOfMemoryError: Java heap space
[2019-11-02T00:12:26,010][INFO ][s.e.p.o.a.z.ClientCnxn   ] Opening socket connection to server zookeeper.us2.svc.cluster.local/10.10.11.183:2181. Will not attempt to authenticate using SASL (unknown error)
[2019-11-02T00:42:04,395][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [metabase-es-client-74cd9fcbdb-qbxxf] fatal error in thread [Thread-6], exiting
java.lang.OutOfMemoryError: Java heap space
[2019-11-02T01:09:58,336][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [metabase-es-client-74cd9fcbdb-qbxxf] fatal error in thread [elasticsearch[metabase-es-client-74cd9fcbdb-qbxxf][management][T#4]], exiting
java.lang.OutOfMemoryError: Java heap space
[2019-11-02T01:50:08,151][WARN ][i.n.c.AbstractChannelHandlerContext] An exception 'java.lang.OutOfMemoryError: Java heap space' [enable DEBUG level for full stacktrace] was thrown by a user handler's exceptionCaught() method while handling the following exception:
java.lang.OutOfMemoryError: Java heap space
[2019-11-02T01:18:38,245][WARN ][o.e.m.j.JvmGcMonitorService] [metabase-es-client-74cd9fcbdb-qbxxf] [gc][old][102][238] duration [2.5h], collections [220]/[2.5h], total [2.5h]/[2.7h], memory [11.9gb]->[11.9gb]/[11.9gb], all_pools {[young] [399.4mb]->[399.4mb]/[399.4mb]}{[survivor] [48.9mb]->[49.8mb]/[49.8mb]}{[old] [11.5gb]->[11.5gb]/[11.5gb]}
[2019-11-02T00:11:04,219][WARN ][i.n.c.AbstractChannelHandlerContext] An exception 'java.lang.InternalError: linkToTargetMethod=Lambda(a0:L)=>{
    t1:L=MethodHandle.invokeBasic(a0:L);t1:L}' [enable DEBUG level for full stacktrace] was thrown by a user handler's exceptionCaught() method while handling the following exception:
io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Java heap space
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:459) ~[netty-codec-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) ~[netty-codec-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) [netty-handler-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [netty-common-4.1.16.Final.jar:4.1.16.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
Caused by: java.lang.OutOfMemoryError: Java heap space

I have been seeing our client nodes crash multiple times, with the exception above popping up each time. Searching around for it points to Netty memory leaks. Could someone help with this?
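
One way to test the Netty leak theory would be to enable Netty's built-in leak detector on one of the client nodes. A minimal sketch, assuming the standard io.netty.leakDetection.level system property that the bundled netty 4.1.16 reads (the chosen level is an assumption, not my current config):

    # config/jvm.options on one ES 6.4.2 client node (sketch)
    # "advanced" samples a fraction of buffer allocations and logs a "LEAK:" record,
    # with access traces, whenever a ByteBuf is garbage-collected without being released.
    -Dio.netty.leakDetection.level=advanced

If the crashes really are a Netty buffer leak, LEAK records should start showing up in the logs well before the heap hits the 12G ceiling.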

Extra logs after the above exception

    [2019-11-02T01:18:38,245][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [metabase-es-client-74cd9fcbdb-qbxxf] fatal error in thread [elasticsearch[metabase-es-client-74cd9fcbdb-qbxxf][management][T#2]], exiting
    java.lang.OutOfMemoryError: Java heap space
    [2019-11-02T01:50:48,964][WARN ][o.e.m.j.JvmGcMonitorService] [metabase-es-client-74cd9fcbdb-qbxxf] [gc][102] overhead, spent [2.5h] collecting in the last [2.5h]
    [2019-11-02T01:50:48,964][INFO ][s.e.p.o.a.z.ZooKeeper    ] Session: 0x0 closed
    [2019-11-02T01:50:48,965][INFO ][s.e.p.o.a.z.ZooKeeper    ] Initiating client connection, connectString=zookeeper.us2.svc.cluster.local:2181 sessionTimeout=60000 watcher=sf.elasticsearch.plugin.org.apache.curator.ConnectionState@7ee17510
    [2019-11-02T01:50:08,152][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [metabase-es-client-74cd9fcbdb-qbxxf] fatal error in thread [Thread-9], exiting
    java.lang.OutOfMemoryError: Java heap space
    [2019-11-02T01:49:14,729][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [metabase-es-client-74cd9fcbdb-qbxxf] fatal error in thread [Thread-11], exiting
    java.lang.OutOfMemoryError: Java heap space
    [2019-11-02T01:10:41,602][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [metabase-es-client-74cd9fcbdb-qbxxf] fatal error in thread [elasticsearch[metabase-es-client-74cd9fcbdb-qbxxf][management][T#5]], exiting
    java.lang.OutOfMemoryError: Java heap space
    [2019-11-02T01:15:35,175][ERROR][o.e.t.n.Netty4Utils      ] fatal error on the network layer
            at org.elasticsearch.transport.netty4.Netty4Utils.maybeDie(Netty4Utils.java:182)
            at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.exceptionCaught(Netty4MessageChannelHandler.java:73)
            at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285)
            at io.netty.channel.AbstractChannelHandlerContext.notifyHandlerException(AbstractChannelHandlerContext.java:850)
            at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:364)
            at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
            at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
            at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)

Oh, and just to clarify: I understand that running out of Java heap space typically means I need to provision more heap. I have been increasing the heap size from 3G to 6G and now 12G, and we have 9 of these client nodes running. The heap just keeps climbing, and GC kicks in constantly or the node crashes right around the point it touches 12G. This makes me believe there is a memory leak. I have taken a heap dump during the constant increase; let me know if anyone wants some info from it.
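
For comparing notes on the dump itself, here is a sketch of how a heap dump can be captured around the OOM (the paths and PID below are placeholders, not taken from this cluster):

    # config/jvm.options -- have the JVM write a .hprof automatically when it throws OutOfMemoryError
    -XX:+HeapDumpOnOutOfMemoryError
    -XX:HeapDumpPath=/var/lib/elasticsearch/heapdumps

    # or capture one manually while the heap is still climbing (live objects only, binary hprof format)
    jmap -dump:live,format=b,file=/tmp/es-client-heap.hprof <elasticsearch-pid>

Opening the resulting .hprof in something like Eclipse MAT and looking at the dominator tree should show whether the 12G is sitting in Netty buffers, in-flight requests, or somewhere else entirely.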
