Out of direct memory on indexing


(Maxim Valyanskiy) #1

Hello!

We are running a performance benchmark on ElasticSearch, and after a while we
see the following out-of-memory exception:

[2012-10-10 17:00:19,728][WARN ][http.netty ] [welsung] Caught exception while handling client http traffic, closing connection [id: 0x522b2996, /0:0:0:0:0:0:0:1:38110 :> /0:0:0:0:0:0:0:1:9200]
java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:658)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:174)
at sun.nio.ch.IOUtil.write(IOUtil.java:53)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
at org.elasticsearch.common.netty.channel.socket.nio.SocketSendBufferPool$UnpooledSendBuffer.transferTo(SocketSendBufferPool.java:205)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:494)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.writeFromTaskLoop(AbstractNioWorker.java:449)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioChannel$WriteTask.run(AbstractNioChannel.java:342)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.processWriteTaskQueue(AbstractNioWorker.java:367)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:260)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:35)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:102)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
[2012-10-10 17:00:36,260][WARN ][http.netty ] [welsung] Caught exception while handling client http traffic, closing connection [id: 0x6de387f9, /0:0:0:0:0:0:0:1:38112 :> /0:0:0:0:0:0:0:1:9200]

We are using the default on-disk index (not an in-memory index), and ElasticSearch
is run with ES_HEAP_SIZE=10g. The resident memory size of the process is about 20g
when we see the problem. It looks like a direct buffer leak somewhere
in the network layer.
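
One way to check a suspicion like this is to watch the JVM's "direct" buffer pool over time rather than the resident size of the whole process. The sketch below uses the standard `BufferPoolMXBean` (Java 7 and later) to print the buffer count, bytes used, and total capacity of each NIO buffer pool. It reports on the JVM it runs in, so it is only an illustration: to observe an Elasticsearch node you would read the same `java.nio:type=BufferPool,name=direct` MBean over JMX, which assumes JMX access is enabled on the node. The class name `DirectBufferStats` is made up for this example.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

/** Prints NIO buffer pool usage for the JVM this code runs in (Java 7+). */
final class DirectBufferStats {

    /** Formats one buffer pool (typically "direct" or "mapped") as a single line. */
    static String describe(BufferPoolMXBean pool) {
        return String.format("%s: count=%d used=%d bytes capacity=%d bytes",
                pool.getName(), pool.getCount(),
                pool.getMemoryUsed(), pool.getTotalCapacity());
    }

    public static void main(String[] args) {
        // The platform exposes one MXBean per buffer pool.
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            System.out.println(describe(pool));
        }
    }
}
```

If the "direct" pool's used bytes keep climbing under a steady load and never drop after a full GC, that would be consistent with a leak; if they plateau near the configured limit, the limit may simply be too small for the workload.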

The data set is about 104 GB of text, on a 2-node cluster. We are running only basic
text query searches (no faceting etc.), and we are using store
compression. The text uploader runs in 16 threads and uses bulk requests;
http.max_content_length: 500m
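
For a rough sense of scale, these uploader settings bound how much request data could be in flight at once. The sketch below multiplies the 16 uploader threads by the 500m `http.max_content_length` cap. It assumes each thread has at most one bulk request outstanding and that `500m` means 500 MiB; neither is stated in the thread, and real requests may be far smaller than the cap. Treat it as an upper bound, not a diagnosis. The class and method names are made up for this example.

```java
/** Upper bound on bulk request bytes in flight, under the assumptions stated above. */
final class BulkInFlightBound {

    /** Worst case: every uploader thread sends one request of the maximum allowed size. */
    static long worstCaseBytes(int uploaderThreads, long maxContentLengthBytes) {
        return Math.multiplyExact((long) uploaderThreads, maxContentLengthBytes);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long bound = worstCaseBytes(16, 500L * mib); // 16 threads, 500m cap (assumed MiB)
        System.out.println(bound / mib + " MiB");    // prints "8000 MiB"
    }
}
```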

Does anyone have an idea what is wrong?

Maxim



(Shay Banon) #2

This failure comes from the networking library failing to allocate direct memory for the networking buffers. Which version of elasticsearch are you using? Also, when you are indexing, are you using the bulk request? If so, are you controlling how big each bulk request is?
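
As an illustration of what controlling the size of a bulk request can look like on the client side, here is a minimal sketch that groups documents into batches capped by their UTF-8 byte size. It is not taken from the thread or from any Elasticsearch client: the class `BulkBatcher`, the method `batchBySize`, and the idea of representing each document as a `String` are assumptions made for the example, and the byte count ignores the per-document action lines that the bulk format adds.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Groups documents into batches whose combined UTF-8 size stays under a cap. */
final class BulkBatcher {

    static List<List<String>> batchBySize(List<String> docs, int maxBatchBytes) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int currentBytes = 0;
        for (String doc : docs) {
            int docBytes = doc.getBytes(StandardCharsets.UTF_8).length;
            // Close the current batch if adding this document would exceed the cap.
            if (!current.isEmpty() && currentBytes + docBytes > maxBatchBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            // A single document larger than the cap still gets a batch of its own.
            current.add(doc);
            currentBytes += docBytes;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("aaaa", "bbbb", "cc", "dddddddddd");
        // With an 8-byte cap this yields [[aaaa, bbbb], [cc], [dddddddddd]].
        System.out.println(batchBySize(docs, 8));
    }
}
```

Each resulting batch would then be sent as one bulk request, so the largest request is bounded by the cap (plus any single oversized document) instead of by `http.max_content_length`.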

Last, how are you interacting with elasticsearch? I mean, the library that you use to issue HTTP requests.

On Oct 11, 2012, at 12:59 AM, Maxim Valyanskiy max.valjanski@gmail.com wrote:



(Jeffrey Gerard) #3

We're seeing nearly the same stacktrace (below) on a shiny new
installation, before indexing anything or even creating an index.
It seems to be triggered by issuing just a few GETs to the :9200/_cluster/nodes
endpoint from a web browser.

I can reproduce this even on a 1-node "cluster", running v0.19.10 with the
cloud-aws 1.9.0 plugin. We've also tried values for ES_HEAP_SIZE ranging
from 1g to 6g and for ES_DIRECT_SIZE from the 64m default to as high as 1g. The
machines have 7.5g RAM, so we aren't hitting that limit.
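
When environment variables such as these seem to have no effect, it can help to confirm which memory flags actually reached the JVM. The sketch below filters a JVM's input arguments for `-Xms`, `-Xmx`, and `-XX:MaxDirectMemorySize`. It inspects the JVM it runs in; for an Elasticsearch node the same argument list is what that process's `RuntimeMXBean` would report over JMX. That ES_HEAP_SIZE and ES_DIRECT_SIZE end up as these flags is an assumption about the startup script, not something confirmed in this thread, and the class name `JvmMemoryFlags` is made up for the example.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Extracts heap and direct-memory flags from a list of JVM arguments. */
final class JvmMemoryFlags {

    static List<String> memoryFlags(List<String> jvmArgs) {
        List<String> flags = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-Xms") || arg.startsWith("-Xmx")
                    || arg.startsWith("-XX:MaxDirectMemorySize")) {
                flags.add(arg);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        // Arguments the current JVM was started with (excludes main-class arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println(memoryFlags(jvmArgs));
    }
}
```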

[2012-10-12 21:07:27,185][WARN ][netty.channel.socket.nio.AbstractNioWorker] Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:632)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:97)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
at org.elasticsearch.common.netty.channel.socket.nio.SocketReceiveBufferAllocator.newBuffer(SocketReceiveBufferAllocator.java:62)
at org.elasticsearch.common.netty.channel.socket.nio.SocketReceiveBufferAllocator.get(SocketReceiveBufferAllocator.java:41)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:57)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:471)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:332)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:35)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:102)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

On Thursday, October 11, 2012 8:00:58 AM UTC-7, kimchy wrote:



(Chris Male) #4

Jeffrey,

On Saturday, October 13, 2012 10:46:02 AM UTC+13, Jeffrey Gerard wrote:

> We're seeing nearly the same stacktrace (below) on a shiny new
> installation, before indexing anything or even creating an index.
> It seems to be triggered by issuing just a few GETs to the :9200/_cluster/nodes
> endpoint from a web browser.

Let me just clarify: you've got no indexed content and in fact no indexes
at all?

> I can reproduce this even on a 1-node "cluster", running v0.19.10 with the
> cloud-aws 1.9.0 plugin.

That does sound worrying. Does it still happen if you use an earlier
version, such as 0.19.9?



(Jeffrey Gerard) #5

Hi Chris,
Yes, to confirm: there was no indexed content and there were no indexes at all.
At the time I was running Elasticsearch as a service, using the deb download
from the ES website. To work around this, I instead downloaded the tar.gz and
launched elasticsearch from the command line. This doesn't explain much, but I
suspect there's some magic in the service init script that was breaking
this or setting too-high memory limits. Anyway, at this point I've got it
working (without the deb) and it's doing great.

Thanks for your help.

On Saturday, October 13, 2012 12:18:03 AM UTC-7, Chris Male wrote:



(simonw-2) #6

Hey Jeffrey,

Does your deb package lock the memory? Can you try to figure out which
setting triggers the behavior, or send me a diff?
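
One simple way to produce such a diff is to capture the JVM arguments of the service launch and of the command-line launch and compare them as sets. The sketch below does only the comparison. The two argument lists in `main` are invented placeholders for illustration and are not the actual flags used by the deb package or the tar.gz distribution; the class name `FlagDiff` is likewise made up.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Compares two sets of JVM arguments and reports what is unique to each. */
final class FlagDiff {

    /** Returns the arguments present in {@code a} but not in {@code b}, sorted. */
    static Set<String> onlyIn(Collection<String> a, Collection<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical inputs: replace with the arguments captured from each launch.
        List<String> serviceArgs = Arrays.asList("-Xms1g", "-Xmx1g", "-XX:MaxDirectMemorySize=64m");
        List<String> tarballArgs = Arrays.asList("-Xms1g", "-Xmx1g");

        System.out.println("only in service launch: " + onlyIn(serviceArgs, tarballArgs));
        System.out.println("only in tarball launch: " + onlyIn(tarballArgs, serviceArgs));
    }
}
```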

simon

On Tuesday, October 16, 2012 1:29:43 AM UTC+2, Jeffrey Gerard wrote:


