My application continuously writes bulk updates to an Elasticsearch
index (index size: ~200,000 docs, 35 MB; shards: 3*2; segment count ~35).
My cluster has 3 nodes, each with 32 GB RAM, ES_HEAP_SIZE=16g,
Elasticsearch v1.3.4.
I am using index.merge.scheduler.max_thread_count: 1 because the nodes
use spinning hard disks.
Unfortunately I often get OutOfMemory errors on every node after merges,
and I have to restart Elasticsearch before any bulk requests work again:
[12:17:49,716][INFO ][index.engine.internal ] [cluster1]
[v-2014-week41][1] now throttling indexing: numMergesInFlight=4,
maxNumMerges=3
[12:17:49,716][INFO ][index.engine.internal ] [cluster1]
[v-2014-week41][0] now throttling indexing: numMergesInFlight=4,
maxNumMerges=3
[12:17:49,719][INFO ][index.engine.internal ] [cluster1]
[v-2014-week41][0] stop throttling indexing: numMergesInFlight=2,
maxNumMerges=3
[12:17:49,727][INFO ][index.engine.internal ] [cluster1]
[v-2014-week41][1] stop throttling indexing: numMergesInFlight=2,
maxNumMerges=3
... (hundreds of log entries like this, until this one:)
[12:31:25,299][INFO ][index.engine.internal ] [cluster1]
[v-2014-week41][1] stop throttling indexing: numMergesInFlight=2,
maxNumMerges=3
[12:32:21,810][DEBUG][action.bulk ] [cluster1]
[v-2014-week41][0], node[02934K_ySZKEaQ3S1Hv9SA], [P], s[STARTED]:
Failed to execute [org.elasticsearch.action.bulk.BulkShardRequest@320ade50]
java.lang.OutOfMemoryError: PermGen space
[12:32:24,776][WARN ][action.bulk ] [cluster1] Failed to
send response for bulk/shard
java.lang.OutOfMemoryError: PermGen space
...
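The "PermGen space" OutOfMemoryError means the JVM's class-metadata area is exhausted, not the 16 GB heap; on Java 7 its default size is well under 100 MB. As a stopgap while diagnosing, the permanent generation can be enlarged and a heap dump captured on failure. A minimal sketch (the values and dump path are illustrative, not recommendations for this cluster):

```shell
# Example JVM options for a Java 7 node (PermGen was removed in Java 8).
# Add these to the Elasticsearch startup environment, e.g. via ES_JAVA_OPTS:
export ES_JAVA_OPTS="-XX:MaxPermSize=256m \
  -XX:+HeapDumpOnOutOfMemoryError \
  -XX:HeapDumpPath=/var/log/elasticsearch"
```

The heap dump can then be opened in a tool such as Eclipse MAT to see what is filling PermGen (typically dynamically generated classes).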
How many index / delete / update requests are you bundling in a
single bulk API call?
I use a max bulk size of 1 MB (around 2,000 docs per bulk); most of the
requests are updates with a small Groovy script that increments some
field values.
I will now try updating the cluster from Java 7 to Java 8, as there
is no PermGen space anymore in Java 8.
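The "max 1 MB per bulk" policy above can be implemented by measuring each action's serialized size and cutting a batch before it would cross the limit. A minimal sketch in Python (field names and sizes are illustrative; a real bulk body would also carry the action metadata line per document):

```python
import json

def chunk_actions(actions, max_bytes=1_000_000):
    """Split bulk actions into batches whose serialized size stays under max_bytes."""
    batch, batch_size = [], 0
    for action in actions:
        size = len(json.dumps(action).encode("utf-8")) + 1  # +1: trailing newline
        if batch and batch_size + size > max_bytes:
            yield batch                      # current batch is full; emit it
            batch, batch_size = [], 0
        batch.append(action)
        batch_size += size
    if batch:
        yield batch                          # emit the final partial batch

# ~2.6 MB of documents splits into a few ~1 MB batches:
docs = [{"doc_id": i, "text": "x" * 500} for i in range(5000)]
batches = list(chunk_actions(docs, max_bytes=1_000_000))
```

Sizing by bytes rather than by document count keeps bulks bounded even when document sizes vary.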
Are you using the same Bulk object each time, or a new instance for each iteration?
I have seen Java code in the past that reused the same bulk object,
so the bulk started with 2k docs, had 4k docs the next iteration, and so on.
Not sure that's your problem here, but it's worth checking.
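The reuse bug described above is easy to see in miniature. A sketch (in Python for brevity; the original code in question is Java, and the numbers are just for illustration):

```python
def build_batches_buggy(iterations, docs_per_iteration):
    """Reuse one bulk buffer across iterations (the bug): it never gets cleared."""
    bulk, sizes = [], []
    for _ in range(iterations):
        bulk.extend(range(docs_per_iteration))  # add this iteration's docs
        sizes.append(len(bulk))                 # ...but the old ones are still there
    return sizes

def build_batches_fixed(iterations, docs_per_iteration):
    """Create a fresh bulk per iteration (the fix)."""
    sizes = []
    for _ in range(iterations):
        bulk = list(range(docs_per_iteration))  # new instance each time
        sizes.append(len(bulk))
    return sizes

print(build_batches_buggy(3, 2000))  # [2000, 4000, 6000] - grows without bound
print(build_batches_fixed(3, 2000))  # [2000, 2000, 2000] - stays constant
```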
Thanks, but that doesn't seem to be the problem (I have double-checked it
now just to be sure); I always create a new instance.
The problem sometimes happens after 3 OutOfMemory-free weeks, but
sometimes just 1 day after the last Elasticsearch restart.
What does the Nodes Info output give? Could you gist it?
Cluster1 Node info · GitHub
(Now running with Java 8 instead of Java 7.)
Unfortunately it isn't possible to reach the node again after the
OutOfMemory error to get the actual node info.
Marvel doesn't show anything unusual before the OutOfMemoryError
(JVM mem < 15%...).
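For the record, a node's JVM memory stats can be polled over HTTP while it is still reachable; on Java 7, non-heap usage includes PermGen, so a steady climb there would be the smoking gun. A minimal sketch, assuming the default HTTP port:

```shell
# Poll JVM memory stats on a live ES 1.x node; watch the non-heap figures,
# since the 15% heap usage seen in Marvel says nothing about PermGen.
curl -s 'http://localhost:9200/_nodes/stats/jvm?pretty'
```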
Looks good. I was just checking that the JVM memory settings have been taken into account.
When you restart your node and monitor node stats, don't you see anything strange?
Nope, it just looks like a normal start.
I don't see anything obvious.
Did you compare with the other nodes' stats?
It sounds like you have enough memory for bulk operations.
Maybe you should try reducing the bulk size and see how it goes?
I will now try it with Java 8 and a reduced bulk size, and will report
back here next month on whether that fixed the problems.
Thanks for your help.
We're experiencing a similar issue after having run ES successfully for
several months without any major changes to our read/write patterns, data
sizes or documents. This is on Java 7 and ES 1.3.4.
Bernhard -- are you using scripting at all? The issue started popping up
after we switched our scripting from MVEL to Groovy.
Is there a chance that the Groovy scripts are blowing up PermGen (but
the MVEL scripts weren't)?
Adam
Yes, we also switched from MVEL to Groovy some months ago, and that's
when the issue started!
But we also changed a lot of other things in our code base, so I wasn't
sure about the cause.
Our scripts are very simple (and nearly the same as they were in MVEL), just
a few lines like: ctx._source.texts += text; ctx._source.state = state; ctx._source.number += 1; ...
Since yesterday everything has run fine (half of the old bulk size, Java
8), but the memory problem usually appeared only after a node had been
running for about a week.
There is a known problem where Groovy uses more PermGen space than MVEL
did; we've opened an issue to fix it here:
Until this is fixed, I believe putting the script on disk will make
Elasticsearch compile it only once, which should keep the PermGen usage
from growing; the script can still be referenced by name and have
parameters passed in.
;; Lee
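The suggestion above works roughly as follows in ES 1.x: the script lives as a file under config/scripts/, is compiled once when loaded, and update requests reference it by filename and pass values via params. A sketch based on Bernhard's script (the filename, index, type, and field names are illustrative):

```
# File: config/scripts/increment.groovy (compiled once when loaded):
#   ctx._source.texts += text; ctx._source.state = state; ctx._source.number += 1
#
# An update request then references the script by name and passes parameters:
POST /v-2014-week41/mytype/1/_update
{
  "script": "increment",
  "params": { "text": "some text", "state": "done" }
}
```

Each inline script variant is compiled to its own class, so replacing dynamically generated script bodies with one named script plus params is exactly what keeps the class count (and PermGen) flat.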
Putting the scripts on disk is unfortunately not a good alternative for
us, as we have to create the scripts dynamically.