Bug: memory leak while scrolling over index

Hi all,

I'm trying to scroll over all documents of an Elasticsearch index using
a match_all query. I've set ES_HEAP_SIZE to 8G, but I'm not able to
complete the operation because Elasticsearch runs out of memory (see the
log at the end of this mail).

The head plugin tells me the index size is around 400GB with around 9.5M
documents. I'm using a single document type with the following mapping:

{
  "src_doc": {
    "_all": {
      "enabled": false
    },
    "_source": {
      "enabled": false
    },
    "properties": {
      "content": {
        "type": "binary"
      },
      "exception": {
        "index": "no",
        "store": true,
        "type": "string"
      },
      "last_update": {
        "format": "YYYY-MM-dd",
        "store": true,
        "type": "date"
      },
      "title": {
        "index": "no",
        "store": true,
        "type": "string"
      },
      "uid": {
        "index": "no",
        "type": "string"
      },
      "url": {
        "index": "no",
        "store": true,
        "type": "string"
      }
    }
  }
}

I'm using Elasticsearch 0.90.2, but the issue was already present in
0.90.0. The cluster is a single node that is not being used for anything else.

Here's the log:

2013-07-12T13:59:39.40468 java.lang.OutOfMemoryError: Java heap space
2013-07-12T13:59:39.43523 Dumping heap to java_pid8908.hprof ...
2013-07-12T14:00:14.69555 Heap dump file created [8513498041 bytes in 35.278 secs]
2013-07-12T14:00:14.73700 [2013-07-12 16:00:14,702][WARN ][http.netty ] [graph.8908] Caught exception while handling client http traffic, closing connection [id: 0x8d823a3d, /127.0.0.1:52874 => /127.0.0.1:9250]
2013-07-12T14:00:14.73702 java.lang.OutOfMemoryError: Java heap space
2013-07-12T14:00:14.73702 at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
2013-07-12T14:00:14.73702 at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
2013-07-12T14:00:14.73702 at org.elasticsearch.common.netty.buffer.CompositeChannelBuffer.toByteBuffer(CompositeChannelBuffer.java:649)
2013-07-12T14:00:14.73703 at org.elasticsearch.common.netty.buffer.AbstractChannelBuffer.toByteBuffer(AbstractChannelBuffer.java:530)
2013-07-12T14:00:14.73703 at org.elasticsearch.common.netty.channel.socket.nio.SocketSendBufferPool.acquire(SocketSendBufferPool.java:77)
2013-07-12T14:00:14.73703 at org.elasticsearch.common.netty.channel.socket.nio.SocketSendBufferPool.acquire(SocketSendBufferPool.java:46)
2013-07-12T14:00:14.73703 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:194)
2013-07-12T14:00:14.73703 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.writeFromTaskLoop(AbstractNioWorker.java:152)
2013-07-12T14:00:14.73704 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioChannel$WriteTask.run(AbstractNioChannel.java:335)
2013-07-12T14:00:14.73704 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:366)
2013-07-12T14:00:14.73705 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:290)
2013-07-12T14:00:14.73706 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
2013-07-12T14:00:14.73706 at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
2013-07-12T14:00:14.73706 at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
2013-07-12T14:00:14.73706 at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
2013-07-12T14:00:14.73706 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
2013-07-12T14:00:14.73707 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
2013-07-12T14:00:14.73707 at java.lang.Thread.run(Thread.java:722)
2013-07-12T14:00:18.29386 [2013-07-12 16:00:18,293][WARN ][http.netty ] [graph.8908] Caught exception while handling client http traffic, closing connection [id: 0xd2b513bc, /127.0.0.1:52875 => /127.0.0.1:9250]
2013-07-12T14:00:18.29388 java.lang.OutOfMemoryError: Java heap space
2013-07-12T14:00:18.29388 at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
2013-07-12T14:00:18.29388 at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
2013-07-12T14:00:18.29388 at org.elasticsearch.common.netty.buffer.CompositeChannelBuffer.toByteBuffer(CompositeChannelBuffer.java:649)
2013-07-12T14:00:18.29389 at org.elasticsearch.common.netty.buffer.AbstractChannelBuffer.toByteBuffer(AbstractChannelBuffer.java:530)
2013-07-12T14:00:18.29389 at org.elasticsearch.common.netty.channel.socket.nio.SocketSendBufferPool.acquire(SocketSendBufferPool.java:77)
2013-07-12T14:00:18.29389 at org.elasticsearch.common.netty.channel.socket.nio.SocketSendBufferPool.acquire(SocketSendBufferPool.java:46)
2013-07-12T14:00:18.29389 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:194)
2013-07-12T14:00:18.29389 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.writeFromTaskLoop(AbstractNioWorker.java:152)
2013-07-12T14:00:18.29390 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioChannel$WriteTask.run(AbstractNioChannel.java:335)
2013-07-12T14:00:18.29390 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:366)
2013-07-12T14:00:18.29391 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:290)
2013-07-12T14:00:18.29391 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
2013-07-12T14:00:18.29391 at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
2013-07-12T14:00:18.29391 at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
2013-07-12T14:00:18.29391 at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
2013-07-12T14:00:18.29391 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
2013-07-12T14:00:18.29392 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
2013-07-12T14:00:18.29392 at java.lang.Thread.run(Thread.java:722)

--
Cheers
Ralf


Ralf,

What is your size limit? Without any details, your description implies that
you are trying to return all of those 9.5M documents in one response.

You can use a scroll query with a relatively small size limit (say, 100 or
so). Then take the scroll ID from each response to feed back into the next
scan.

In my experience, match_all is fine. Just don't try to return all
9.5M documents in one response.
See the Elasticsearch documentation on the scan and scroll search types
for ideas.
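
To make that concrete, here is a minimal sketch of such a loop against the
REST API. The index name, host/port, field list, and page size are
placeholders (your actual values will differ), and any HTTP client works
the same way:

import json
import requests  # any HTTP client will do; requests keeps the sketch short

ES = "http://127.0.0.1:9200"   # default HTTP port; adjust for your node
INDEX = "src_index"            # placeholder index name

# Start a scan/scroll: match_all, small page size, stored fields only
# (your mapping disables _source, so ask for the stored fields explicitly).
# Note: with search_type=scan, size applies per shard.
resp = requests.post(
    ES + "/" + INDEX + "/src_doc/_search",
    params={"search_type": "scan", "scroll": "5m", "size": 100},
    data=json.dumps({
        "query": {"match_all": {}},
        "fields": ["title", "url", "last_update"],
    }),
).json()
scroll_id = resp["_scroll_id"]

# Feed the scroll ID from each response into the next request,
# until a page comes back empty.
while True:
    resp = requests.post(ES + "/_search/scroll",
                         params={"scroll": "5m"},
                         data=scroll_id).json()
    hits = resp["hits"]["hits"]
    if not hits:
        break
    scroll_id = resp["_scroll_id"]
    for hit in hits:
        pass  # process hit["fields"] here

The scan search type also skips scoring and sorting, so each page stays
cheap regardless of how large the index is.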

On Monday, July 15, 2013 7:22:27 AM UTC-4, Ralf Schmitt wrote:

Hi all,

I'm trying to scroll over all documents of an Elasticsearch Index using
a match_all query. I've set ES_HEAP_SIZE to 8G but I'm not able to
complete the operation, because elasticsearch runs out of memory (see
log at the end of this mail).


It would be nice to see the source code for how you scroll over
the index.

Jörg


"joergprante@gmail.com" joergprante@gmail.com writes:

It would be nice to see the source code for how you scroll over
the index.

Thanks, I've already opened an issue on GitHub:

The issue is already closed and I'm waiting for an ES version that ships
with Lucene 4.4.

--
Cheers
Ralf
