Hello!
We are running a performance benchmark on ElasticSearch, and after a while we
see the following out-of-memory exception:
[2012-10-10 17:00:19,728][WARN ][http.netty ] [welsung]
Caught exception while handling client http traffic, closing connection
[id: 0x522b2996, /0:0:0:0:0:0:0:1:38110 :> /0:0:0:0:0:0:0:1:9200]
java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:658)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:174)
at sun.nio.ch.IOUtil.write(IOUtil.java:53)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
at org.elasticsearch.common.netty.channel.socket.nio.SocketSendBufferPool$UnpooledSendBuffer.transferTo(SocketSendBufferPool.java:205)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:494)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.writeFromTaskLoop(AbstractNioWorker.java:449)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioChannel$WriteTask.run(AbstractNioChannel.java:342)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.processWriteTaskQueue(AbstractNioWorker.java:367)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:260)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:35)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:102)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
[2012-10-10 17:00:36,260][WARN ][http.netty ] [welsung]
Caught exception while handling client http traffic, closing connection
[id: 0x6de387f9, /0:0:0:0:0:0:0:1:38112 :> /0:0:0:0:0:0:0:1:9200]
We are using the default on-disk index (not the in-memory index), and
ElasticSearch is run with ES_HEAP_SIZE=10g. The resident memory size of the
process is about 20g when we see the problem. It looks like a direct buffer
leak somewhere in the network layer.
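One thing we may try (a sketch, not a confirmed fix): the JVM sizes its direct (off-heap) buffer pool separately from the heap, so capping it with the standard HotSpot flag -XX:MaxDirectMemorySize should at least make a leak fail fast instead of letting RSS grow to 20g. The use of ES_JAVA_OPTS to pass extra JVM flags is an assumption about our startup script:

```shell
# Sketch: cap direct buffer memory alongside the 10g heap.
# ES_JAVA_OPTS as the pass-through variable is an assumption here;
# -XX:MaxDirectMemorySize is a standard HotSpot flag.
export ES_HEAP_SIZE=10g
export ES_JAVA_OPTS="-XX:MaxDirectMemorySize=4g"
bin/elasticsearch
```

With this cap in place, allocation failures would surface at 4g of direct buffers rather than after the process balloons.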
The data set is about 104 GB of text on a 2-node cluster. We are running only
basic text query searches (no faceting etc.), and we are using store
compression. The text uploader runs in 16 threads and uses bulk requests;
http.max_content_length is set to 500m.
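For reference, each uploader thread issues bulk requests of roughly this shape (index and type names below are placeholders, not our real ones); with http.max_content_length at 500m, a single request body can be very large:

```shell
# Sketch of one bulk request as the uploader sends it. Each action
# line is followed by its source document, newline-delimited, and the
# body must end with a trailing newline. Index/type names are placeholders.
cat > bulk.ndjson <<'EOF'
{"index":{"_index":"docs","_type":"doc","_id":"1"}}
{"text":"first document"}
{"index":{"_index":"docs","_type":"doc","_id":"2"}}
{"text":"second document"}
EOF
curl -XPOST 'http://localhost:9200/_bulk' --data-binary @bulk.ndjson
```

Since the server writes the whole response for such requests through Netty's direct buffers, large bulk bodies may be what exhausts the direct memory pool.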
Does anyone have an idea what is wrong?
Maxim
--