OutOfDirectMemoryError crash loop

I keep hitting this error within a couple of minutes of starting Logstash, and I can watch the actual memory usage of the Logstash container explode and stay pinned there until I restart it manually.

[2019-04-19T11:11:19,778][INFO ][org.logstash.beats.BeatsHandler] [local: 192.168.3.178:5044, remote: x.x.x.x:52110] Handling exception: failed to allocate 16777216 byte(s) of direct memory (used: 8589934592, max: 8589934592)
[2019-04-19T11:11:19,778][WARN ][io.netty.channel.DefaultChannelPipeline] An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 8589934592, max: 8589934592)
at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:640) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:594) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:764) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:740) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:244) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.buffer.PoolArena.allocate(PoolArena.java:226) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.buffer.PoolArena.reallocate(PoolArena.java:397) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.buffer.PooledByteBuf.capacity(PooledByteBuf.java:118) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.buffer.AbstractByteBuf.ensureWritable0(AbstractByteBuf.java:285) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:265) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1079) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1072) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1062) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:92) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:263) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:38) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:353) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:66) ~[netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [netty-all-4.1.18.Final.jar:4.1.18.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-all-4.1.18.Final.jar:4.1.18.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]

I am ingesting logs from around 40 Beats and I have around 50 filter plugins (roughly 50,000 docs/minute), but a single doc usually goes through at most 2 filter plugins before it is eventually sent to ES. I have tried setting the JVM heap from -Xmx4g up to 16g, as well as -XX:MaxDirectMemorySize from 16m to 16g. The bigger the heap, the longer it takes to crash, but no matter what the JVM settings are, I eventually end up facing this. The container is not memory-limited, and the machine is not out of resources either.
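For reference, this is roughly how I am setting those values in config/jvm.options (the exact numbers below are just one of the combinations I tried; the 8g direct memory limit matches the 8589934592-byte max shown in the log above):

# heap for Logstash itself
-Xms8g
-Xmx8g
# off-heap direct memory, used by the Beats input's Netty buffers (separate from the heap)
-XX:MaxDirectMemorySize=8g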

Any idea how to resolve this issue? The longer I wait, the more logs pile up, putting even more pressure on the Logstash instance when it tries to catch up.

The whole ELK stack is v6.3.2
