OutOfMemoryError (OOM) while indexing documents


(Alexander Ott) #1

Hi,

we always run into an OutOfMemoryError while indexing documents or shortly
afterwards.
We have only one instance of Elasticsearch, version 1.0.1 (no cluster).

Index information:
size: 203G (203G)
docs: 237.354.313 (237.354.313)

Our JVM settings are as follows:

/usr/lib/jvm/java-7-oracle/bin/java -Xms16g -Xmx16g -Xss256k
-Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
-XX:+HeapDumpOnOutOfMemoryError -Delasticsearch
-Des.pidfile=/var/run/elasticsearch.pid
-Des.path.home=/usr/share/elasticsearch
-cp :/usr/share/elasticsearch/lib/elasticsearch-1.0.1.jar:/usr/share/elasticsearch/lib/:/usr/share/elasticsearch/lib/sigar/
-Des.default.config=/etc/elasticsearch/elasticsearch.yml
-Des.default.path.home=/usr/share/elasticsearch
-Des.default.path.logs=/var/log/elasticsearch
-Des.default.path.data=/var/lib/elasticsearch
-Des.default.path.work=/tmp/elasticsearch
-Des.default.path.conf=/etc/elasticsearch
org.elasticsearch.bootstrap.Elasticsearch

OutOfMemoryError:
[2014-03-12 01:27:27,964][INFO ][monitor.jvm ] [Stiletto] [gc][old][32451][309] duration [5.1s], collections [1]/[5.9s], total [5.1s]/[3.1m], memory [15.8gb]->[15.7gb]/[15.9gb], all_pools {[young] [665.6mb]->[583.7mb]/[665.6mb]}{[survivor] [32.9mb]->[0b]/[83.1mb]}{[old] [15.1gb]->[15.1gb]/[15.1gb]}
[2014-03-12 01:28:23,822][INFO ][monitor.jvm ] [Stiletto] [gc][old][32466][322] duration [5s], collections [1]/[5.9s], total [5s]/[3.8m], memory [15.8gb]->[15.8gb]/[15.9gb], all_pools {[young] [652.5mb]->[663.8mb]/[665.6mb]}{[survivor] [0b]->[0b]/[83.1mb]}{[old] [15.1gb]->[15.1gb]/[15.1gb]}
[2014-03-12 01:33:29,814][WARN ][index.merge.scheduler ] [Stiletto] [myIndex][0] failed to merge
java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.util.fst.BytesStore.writeByte(BytesStore.java:83)
at org.apache.lucene.util.fst.FST.<init>(FST.java:282)
at org.apache.lucene.util.fst.Builder.<init>(Builder.java:163)
at org.apache.lucene.codecs.BlockTreeTermsWriter$PendingBlock.compileIndex(BlockTreeTermsWriter.java:420)
at org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.writeBlocks(BlockTreeTermsWriter.java:569)
at org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter$FindBlocks.freeze(BlockTreeTermsWriter.java:544)
at org.apache.lucene.util.fst.Builder.freezeTail(Builder.java:214)
at org.apache.lucene.util.fst.Builder.add(Builder.java:394)
at org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.finishTerm(BlockTreeTermsWriter.java:1000)
at org.apache.lucene.codecs.TermsConsumer.merge(TermsConsumer.java:166)
at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:72)
at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:383)
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:106)
at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4071)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3668)
at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
at org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:107)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)

We also increased the heap to 32g, but with the same result:
[2014-03-12 22:39:53,817][INFO ][monitor.jvm ] [Charcoal] [gc][old][32895][86] duration [6.9s], collections [1]/[7.3s], total [6.9s]/[19.6s], memory [20.5gb]->[12.7gb]/[31.9gb], all_pools {[young] [654.9mb]->[1.9mb]/[665.6mb]}{[survivor] [83.1mb]->[0b]/[83.1mb]}{[old] [19.8gb]->[12.7gb]/[31.1gb]}
[2014-03-12 23:11:07,015][INFO ][monitor.jvm ] [Charcoal] [gc][old][34750][166] duration [8s], collections [1]/[8.6s], total [8s]/[29.1s], memory [30.9gb]->[30.9gb]/[31.9gb], all_pools {[young] [660.6mb]->[1mb]/[665.6mb]}{[survivor] [83.1mb]->[0b]/[83.1mb]}{[old] [30.2gb]->[30.9gb]/[31.1gb]}
[2014-03-12 23:12:18,117][INFO ][monitor.jvm ] [Charcoal] [gc][old][34812][182] duration [7.1s], collections [1]/[8.1s], total [7.1s]/[36.6s], memory [31.5gb]->[31.5gb]/[31.9gb], all_pools {[young] [655.6mb]->[410.3mb]/[665.6mb]}{[survivor] [0b]->[0b]/[83.1mb]}{[old] [30.9gb]->[31.1gb]/[31.1gb]}
[2014-03-12 23:12:56,294][INFO ][monitor.jvm ] [Charcoal] [gc][old][34844][193] duration [7.1s], collections [1]/[7.1s], total [7.1s]/[43.9s], memory [31.9gb]->[31.9gb]/[31.9gb], all_pools {[young] [665.6mb]->[665.2mb]/[665.6mb]}{[survivor] [81.9mb]->[82.8mb]/[83.1mb]}{[old] [31.1gb]->[31.1gb]/[31.1gb]}
[2014-03-12 23:13:11,836][WARN ][index.merge.scheduler ] [Charcoal] [myIndex][3] failed to merge
java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.loadNumeric(Lucene42DocValuesProducer.java:228)
at org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.getNumeric(Lucene42DocValuesProducer.java:188)
at org.apache.lucene.index.SegmentCoreReaders.getNormValues(SegmentCoreReaders.java:159)
at org.apache.lucene.index.SegmentReader.getNormValues(SegmentReader.java:516)
at org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:232)
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:127)
at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4071)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3668)
at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
at org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:107)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)

Java version:
java version "1.7.0_51"
Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)

elasticsearch.yml: are there settings here that should be enabled?
#indices.memory.index_buffer_size: 40%
#indices.store.throttle.type: merge
#indices.store.throttle.max_bytes_per_sec: 50mb
#index.refresh_interval: 2s
#index.fielddata.cache: soft
#index.store.type: mmapfs
#index.fielddata.cache.size: 20%

Any ideas how to solve this problem? Why isn't the old generation being
cleaned up? Shouldn't it be?
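
To quantify how little each old-generation collection frees, the before/after figures can be pulled out of the monitor.jvm lines. The following is only a minimal sketch, assuming the log format shown above; the function name `old_gen_usage` and the regular expression are illustrative and not part of Elasticsearch.

```python
import re

# Matches the "{[old] [before]->[after]/[max]}" pool section of a monitor.jvm GC line.
OLD_POOL = re.compile(
    r"\{\[old\]\s*\[([\d.]+)(b|kb|mb|gb)\]->\[([\d.]+)(b|kb|mb|gb)\]/\[([\d.]+)(b|kb|mb|gb)\]\}"
)

# Conversion factors from the units used in the log to megabytes.
UNIT_TO_MB = {"b": 1 / (1024 * 1024), "kb": 1 / 1024, "mb": 1.0, "gb": 1024.0}


def old_gen_usage(line):
    """Return (before_mb, after_mb, max_mb) for the old pool, or None if absent."""
    match = OLD_POOL.search(line)
    if match is None:
        return None
    before, before_unit, after, after_unit, maximum, max_unit = match.groups()
    return (
        float(before) * UNIT_TO_MB[before_unit],
        float(after) * UNIT_TO_MB[after_unit],
        float(maximum) * UNIT_TO_MB[max_unit],
    )


# One of the GC lines from the 32g run, joined onto a single line.
SAMPLE = (
    "[2014-03-12 22:39:53,817][INFO ][monitor.jvm ] [Charcoal] "
    "[gc][old][32895][86] duration [6.9s], collections [1]/[7.3s], total "
    "[6.9s]/[19.6s], memory [20.5gb]->[12.7gb]/[31.9gb], all_pools {[young] "
    "[654.9mb]->[1.9mb]/[665.6mb]}{[survivor] [83.1mb]->[0b]/[83.1mb]}{[old] "
    "[19.8gb]->[12.7gb]/[31.1gb]}"
)

if __name__ == "__main__":
    before, after, maximum = old_gen_usage(SAMPLE)
    print(f"old gen: freed {before - after:.0f} MB, {after / maximum:.0%} still used")
```

Running this over every GC line in the log would show whether the old generation's "after" value keeps climbing toward its maximum, which is the pattern in the entries just before each OOM.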

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/9c958d28-159a-4464-8198-b54964a8bf3e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Zachary Tong) #2

Are you running searches at the same time, or only indexing? Are you bulk
indexing? How big (in physical kb/mb) are your bulk requests?

Can you attach the output of these APIs (preferably during memory buildup
but before the OOM):

  • curl -XGET 'localhost:9200/_nodes/'
  • curl -XGET 'localhost:9200/_nodes/stats'
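
To catch the memory buildup before the OOM, the two calls above can be repeated on a timer and written to files. This is a rough sketch, assuming the node listens on localhost:9200 as in the curl commands; the file names, the interval, and the helper names are made up for illustration.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:9200"  # host and port taken from the curl commands above
ENDPOINTS = {"nodes_info": "/_nodes/", "nodes_stats": "/_nodes/stats"}


def http_get(url):
    """Fetch a URL and return the decoded JSON body."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def capture_snapshot(fetch=http_get, base_url=BASE_URL):
    """Collect both API outputs into one timestamped dictionary."""
    snapshot = {"captured_at": time.strftime("%Y-%m-%dT%H:%M:%S")}
    for name, path in ENDPOINTS.items():
        snapshot[name] = fetch(base_url + path)
    return snapshot


def capture_loop(count, interval_seconds, fetch=http_get):
    """Write `count` snapshots, one file each, sleeping between captures."""
    for index in range(count):
        snapshot = capture_snapshot(fetch)
        # Hypothetical file naming; adjust to taste.
        with open(f"es_snapshot_{index:03d}.json", "w") as handle:
            json.dump(snapshot, handle, indent=2)
        if index + 1 < count:
            time.sleep(interval_seconds)


if __name__ == "__main__":
    # Ten snapshots, one minute apart, while indexing is running.
    capture_loop(count=10, interval_seconds=60)
```

The `fetch` parameter exists only so the capture logic can be tried without a running node.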

I would recommend downgrading your JVM to Java 1.7.0_25 (7u25). There are
known SIGSEGV bugs in the most recent versions of the JVM which have not been
fixed yet. It should be unrelated to your problem, but it is best to rule the
JVM out.

I would not touch any of those configs. In general, when debugging
problems it is best to restore as many of the configs to their default
settings as possible.



(Alexander Ott) #3

Hi,

Attached you can find the es_log and the captured node JVM stats.
We are only indexing at this time, and we use bulk requests.

As you can see at log entry "2014-03-14 21:18:59,873" in es_log, our
indexing process finished at this time and the OOM occurred afterwards.



(Zachary Tong) #4

Can you attach the full Node Stats and Node Info output? There were other
stats/metrics that I wanted to check (such as field data, bulk queue/size,
etc).

  • How large (physically, in kb/mb) are your bulk indexing requests?
    Bulks should be 5-15mb in size
  • How many concurrent bulks are you performing? Given your cluster size,
    a good number should probably be around 20-30
  • Are you distributing bulks evenly across the cluster?
  • I see that your heap is 32gb. How big are these machines?
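
As a rough illustration of sizing bulks by bytes rather than by entry count, the sketch below groups documents into newline-delimited bulk bodies under a byte limit. It is an assumption-laden example: the 10mb target is simply the middle of the range suggested above, the `{"index": {}}` action line assumes the index and type are given in the request URL, and `bulk_bodies` is a made-up helper name.

```python
import json

TARGET_BYTES = 10 * 1024 * 1024  # middle of the 5-15mb range suggested above


def bulk_bodies(documents, max_bytes=TARGET_BYTES):
    """Yield newline-delimited bulk bodies, each at most max_bytes where possible.

    A single document larger than max_bytes is emitted in a body of its own.
    """
    lines, size = [], 0
    for doc in documents:
        # One action line plus one source line per document.
        entry = json.dumps({"index": {}}) + "\n" + json.dumps(doc) + "\n"
        entry_size = len(entry.encode("utf-8"))
        if lines and size + entry_size > max_bytes:
            yield "".join(lines)
            lines, size = [], 0
        lines.append(entry)
        size += entry_size
    if lines:
        yield "".join(lines)


if __name__ == "__main__":
    # Hypothetical documents, only to show the resulting body sizes.
    docs = ({"id": i, "text": "x" * 1000} for i in range(50000))
    for body in bulk_bodies(docs):
        print(len(body.encode("utf-8")))
```

Each yielded string could then be sent as the body of one bulk request.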

-Zach



(Alexander Ott) #5

At the moment I can provide only the JVM stats; I will capture the other
stats as soon as possible.

We use 5-20 threads, each of which processes bulks with a maximum size of
100 entries.
We only use one node/machine for development, so we have no cluster for
development.
The machine has 64gb RAM and we increased the heap from 16gb to 32gb.
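
Since an entry count alone does not say how large a bulk is physically, the serialized size of a 100-entry bulk and the worst case across all threads could be estimated along these lines. This is only a sketch: the `{"index": {}}` action line and the sample document are assumptions, and the real action line would normally be longer if it carries index, type, or id.

```python
import json


def bulk_size_bytes(documents):
    """Serialized size of one bulk body: an action line and a source line per document."""
    action_size = len(json.dumps({"index": {}}).encode("utf-8")) + 1  # +1 for the newline
    total = 0
    for doc in documents:
        total += action_size
        total += len(json.dumps(doc).encode("utf-8")) + 1  # +1 for the newline
    return total


def in_flight_bytes(documents, threads):
    """Upper bound on bulk payload held at once if every thread sends a full bulk."""
    return bulk_size_bytes(documents) * threads


if __name__ == "__main__":
    # Hypothetical 100-entry bulk; replace with real documents to get real numbers.
    sample_bulk = [{"title": "example", "body": "x" * 2000} for _ in range(100)]
    print(bulk_size_bytes(sample_bulk), "bytes per bulk")
    print(in_flight_bytes(sample_bulk, threads=20), "bytes with 20 threads")
```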



(Zachary Tong) #6

Ah, sorry, I misread your JVM stats dump (thought it was one long list,
instead of multiple calls to the same API). With a single node cluster, 20
concurrent bulks may be too many. Bulk requests have to sit in memory
while they are waiting to be processed, so it is possible to eat up your
heap with many pending bulk requests just hanging out, especially if they
are very large. I'll know more once I can see the Node Stats output.

More questions! :)

  • How big are your documents on average?
  • Have you enabled any codecs and/or changed the posting_format of any
    fields in your document?
  • Are you using warmers?
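
Since pending bulk requests sit in memory, one way to test the "too many concurrent bulks" theory is to cap how many are in flight at once. Lowering the number of loader threads is the simplest option; the sketch below shows an alternative that keeps the threads but gates the send call with a semaphore. `BoundedBulkSender` and `fake_send` are hypothetical names for illustration, and `fake_send` stands in for whatever call actually posts the bulk:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

class BoundedBulkSender:
    """Lets at most max_in_flight bulk requests run at the same time."""

    def __init__(self, send, max_in_flight):
        self._send = send
        self._slots = threading.Semaphore(max_in_flight)
        self._lock = threading.Lock()
        self._in_flight = 0
        self.peak_in_flight = 0  # highest concurrency actually observed

    def submit(self, body):
        # Blocks the calling thread until one of the slots is free.
        with self._slots:
            with self._lock:
                self._in_flight += 1
                self.peak_in_flight = max(self.peak_in_flight, self._in_flight)
            try:
                return self._send(body)
            finally:
                with self._lock:
                    self._in_flight -= 1

def fake_send(body):
    # Placeholder for the real bulk call; just simulates some latency.
    time.sleep(0.01)
    return len(body)

sender = BoundedBulkSender(fake_send, max_in_flight=4)
with ThreadPoolExecutor(max_workers=20) as pool:
    results = list(pool.map(sender.submit, [b"x" * 10] * 50))
print("peak concurrent bulks:", sender.peak_in_flight)
```

With 20 worker threads and a limit of 4, the observed peak never exceeds 4, so the heap held by waiting requests is bounded by roughly 4 times the bulk size.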



(Alexander Ott) #7

Attached are the captured node stats and, again, the newest es_log.
I changed the garbage collector from UseParNewGC to UseG1GC, with the result
that the OutOfMemoryError no longer occurs. But as you can see in the attached
es_log file, the monitor.jvm warnings are still present.
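
To judge whether the remaining monitor.jvm warnings are harmless, it may help to pull the old-generation numbers out of each entry and see how much a collection actually frees. A rough sketch, assuming the entry format posted at the start of this thread (pool names may differ under G1) and assuming wrapped log lines have been joined back into one line first. The helper names are made up for this example:

```python
import re

UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}
SIZE = r"\[([\d.]+)(gb|mb|kb|b)\]"
# Matches the "{[old] [before]->[after]/[limit]}" part of a monitor.jvm entry.
OLD_POOL = re.compile(r"\{\[old\]\s*" + SIZE + r"->" + SIZE + r"/" + SIZE + r"\}")

def to_bytes(value, unit):
    return float(value) * UNITS[unit]

def old_gen_usage(line):
    """Return (before, after, limit) in bytes for the old pool, or None if absent."""
    m = OLD_POOL.search(line)
    if m is None:
        return None
    before = to_bytes(m.group(1), m.group(2))
    after = to_bytes(m.group(3), m.group(4))
    limit = to_bytes(m.group(5), m.group(6))
    return before, after, limit

# One entry from the first post, joined into a single line.
line = ("[gc][old][32451][309] duration [5.1s], collections [1]/[5.9s], "
        "total [5.1s]/[3.1m], memory [15.8gb]->[15.7gb]/[15.9gb], all_pools "
        "{[young] [665.6mb]->[583.7mb]/[665.6mb]}"
        "{[survivor] [32.9mb]->[0b]/[83.1mb]}"
        "{[old] [15.1gb]->[15.1gb]/[15.1gb]}")
before, after, limit = old_gen_usage(line)
print("old gen after GC: %.0f%% of pool" % (100 * after / limit))
```

An old pool that stays at its limit after a collection, as in this entry, means the collector found nothing to free.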



(Zachary Tong) #8

My observations from your Node Stats

  • Your node tends to have around 20-25 merges happening at any given
    time. The default max is 10...have you changed any of the merge policy
    settings? Can you attach your elasticsearch.yml?
  • At one point, your segments were using 24gb of the heap (due to
    associated memory structures like bloom filter, etc). How many primary
    shards are in your index?
  • Your bulks look mostly ok, but you are getting rejections. I'd slow
    the bulk loader down a little bit (rejections mean ES is overloaded)

If you can take a heap dump, I would be willing to load it up and look
through the allocated objects. That would be the fastest way to identify
what is eating your heap and start to work on why. To take a heap dump, run
this (with the process ID of the Elasticsearch JVM in place of <pid>), then
zip it up and save it somewhere: jmap -dump:format=b,file=dump.bin <pid>

As an aside, it's hard to help debug when you don't answer all of the
questions I've asked :P

Unanswered questions from upthread:

  • Have you enabled any codecs and/or changed the posting_format of any
    fields in your document?
  • curl -XGET 'localhost:9200/_nodes/'

Hope we can get this sorted for you soon!
-Zach
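
One way to "slow the bulk loader down" when rejections show up is to re-send only the rejected actions after a growing pause. This is a sketch under stated assumptions, not a description of any client library: `send` is a hypothetical callable that takes a list of actions and returns one status code per action, and status 429 is assumed to mark a rejected action.

```python
import time

def send_with_backoff(send, actions, max_retries=5, base_delay=0.5, sleep=time.sleep):
    """Send actions, retrying only the rejected ones with exponential backoff.

    Returns the actions that were still rejected after the last attempt.
    """
    pending = list(actions)
    for attempt in range(max_retries + 1):
        statuses = send(pending)
        # Keep only the actions that were rejected (assumed status 429).
        pending = [a for a, s in zip(pending, statuses) if s == 429]
        if not pending:
            return []
        if attempt < max_retries:
            # Wait 0.5s, 1s, 2s, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))
    return pending

calls = []

def fake_send(batch):
    # Stand-in for the real bulk call: the first call rejects every second
    # action, later calls accept everything.
    calls.append(len(batch))
    if len(calls) == 1:
        return [429 if i % 2 else 200 for i in range(len(batch))]
    return [200] * len(batch)

delays = []
leftover = send_with_backoff(fake_send, list(range(10)), sleep=delays.append)
print(leftover, calls, delays)
```

Passing `sleep` as a parameter keeps the example quick to run; a real loader would leave the default `time.sleep` in place.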

On Tuesday, March 18, 2014 5:29:40 AM UTC-4, Alexander Ott wrote:

Attached the captured node stats and again the newest es_log.
I changed the garbage collector from UseParNewGC to UseG1GC with the
result that the OutOfMemoryError doesn't occur. But as you can see in the
attached es_log file the warnings of monitor.jvm are still present.

Am Montag, 17. März 2014 14:32:29 UTC+1 schrieb Zachary Tong:

Ah, sorry, I misread your JVM stats dump (thought it was one long list,
instead of multiple calls to the same API). With a single node cluster, 20
concurrent bulks may be too many. Bulk requests have to sit in memory
while they are waiting to be processed, so it is possible to eat up your
heap with many pending bulk requests just hanging out, especially if they
are very large. I'll know more once I can see the Node Stats output.

More questions! :slight_smile:

  • How big are your documents on average?
  • Have you enabled any codecs and/or changed the posting_format of
    any fields in your document?
  • Are you using warmers?

On Monday, March 17, 2014 8:36:04 AM UTC-4, Alexander Ott wrote:

At the moment i can provide only the jvm stats ... i will capture the
other stats as soon as possible.

We use 5-20 threads which will proccess bulks with a max size of 100
entries.
We only use one node/maschine for development so we have no cluster for
development...
The maschine has 64gb RAM and we increase the heap from 16gb to 32gb...

Am Montag, 17. März 2014 12:21:09 UTC+1 schrieb Zachary Tong:

Can you attach the full Node Stats and Node Info output? There were
other stats/metrics that I wanted to check (such as field data, bulk
queue/size, etc).

  • How large (physically, in kb/mb) are your bulk indexing requests?
    Bulks should be 5-15mb in size
  • How many concurrent bulks are you performing? Given you cluster
    size, a good number should probably be around 20-30
  • Are you distributing bulks evenly across the cluster?
  • I see that your heap is 32gb. How big are these machines?

-Zach

On Monday, March 17, 2014 5:33:30 AM UTC-4, Alexander Ott wrote:

Hi,

attatched you can find the es_log and the captured node jvm stats.
We are only indexing at this time and we use bulk requests.

As you can see at log entry "2014-03-14 21:18:59,873" in es_log... at
this time our indexing process finished and afterwards the OOM occurs...

Am Freitag, 14. März 2014 14:47:14 UTC+1 schrieb Zachary Tong:

Are you running searches at the same time, or only indexing? Are you
bulk indexing? How big (in physical kb/mb) are your bulk requests?

Can you attach the output of these APIs (preferably during memory
buildup but before the OOM):

  • curl -XGET 'localhost:9200/_nodes/'
  • curl -XGET 'localhost:9200/_nodes/stats'

I would recommend downgrading your JVM to Java 1.7.0_u25. There are
known sigsegv bugs in the most recent versions of the JVM which have not
been fixed yet. It should be unrelated to your problem, but best to rule
the JVM out.

I would not touch any of those configs. In general, when debugging
problems it is best to restore as many of the configs to their default
settings as possible.

On Friday, March 14, 2014 5:46:12 AM UTC-4, Alexander Ott wrote:

Hi,

we always run in an OutOfMemoryError while indexing documents or
shortly afterwards.
We only have one instance of elasticsearch version 1.0.1 (no cluster)

Index informations:
size: 203G (203G)
docs: 237.354.313 (237.354.313)

Our JVM settings as following:

/usr/lib/jvm/java-7-oracle/bin/java -Xms16g -Xmx16g -Xss256k -Djava.
awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:
CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
-XX:+HeapDumpOnOutOfMemoryError -Delasticsearch -Des.pidfile=/var/
run/elasticsearch.pid -Des.path.home=/usr/share/elasticsearch -cp :
/usr/share/elasticsearch/lib/elasticsearch-1.0.1.jar:/usr/share/
elasticsearch/lib/:/usr/share/elasticsearch/lib/sigar/
-Des.default.config=/etc/elasticsearch/elasticsearch.yml
-Des.default.path.home=/usr/share/elasticsearch
-Des.default.path.logs=/var/log/elasticsearch
-Des.default.path.data=/var/lib/elasticsearch
-Des.default.path.work=/tmp/elasticsearch
-Des.default.path.conf=/etc/elasticsearch
org.elasticsearch.bootstrap.Elasticsearch

OutOfMemoryError:
[2014-03-12 01:27:27,964][INFO ][monitor.jvm ] [Stiletto] [gc][old][32451][309] duration [5.1s], collections [1]/[5.9s], total [5.1s]/[3.1m], memory [15.8gb]->[15.7gb]/[15.9gb], all_pools {[young] [665.6mb]->[583.7mb]/[665.6mb]}{[survivor] [32.9mb]->[0b]/[83.1mb]}{[old] [15.1gb]->[15.1gb]/[15.1gb]}
[2014-03-12 01:28:23,822][INFO ][monitor.jvm ] [Stiletto] [gc][old][32466][322] duration [5s], collections [1]/[5.9s], total [5s]/[3.8m], memory [15.8gb]->[15.8gb]/[15.9gb], all_pools {[young] [652.5mb]->[663.8mb]/[665.6mb]}{[survivor] [0b]->[0b]/[83.1mb]}{[old] [15.1gb]->[15.1gb]/[15.1gb]}
[2014-03-12 01:33:29,814][WARN ][index.merge.scheduler ] [Stiletto] [myIndex][0] failed to merge
java.lang.OutOfMemoryError: Java heap space
    at org.apache.lucene.util.fst.BytesStore.writeByte(BytesStore.java:83)
    at org.apache.lucene.util.fst.FST.<init>(FST.java:282)
    at org.apache.lucene.util.fst.Builder.<init>(Builder.java:163)
    at org.apache.lucene.codecs.BlockTreeTermsWriter$PendingBlock.compileIndex(BlockTreeTermsWriter.java:420)
    at org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.writeBlocks(BlockTreeTermsWriter.java:569)
    at org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter$FindBlocks.freeze(BlockTreeTermsWriter.java:544)
    at org.apache.lucene.util.fst.Builder.freezeTail(Builder.java:214)
    at org.apache.lucene.util.fst.Builder.add(Builder.java:394)
    at org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.finishTerm(BlockTreeTermsWriter.java:1000)
    at org.apache.lucene.codecs.TermsConsumer.merge(TermsConsumer.java:166)
    at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:72)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:383)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:106)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4071)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3668)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
    at org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:107)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)

We also increased heap to 32g but with the same result
[2014-03-12 22:39:53,817][INFO ][monitor.jvm ] [Charcoal] [gc][old][32895][86] duration [6.9s], collections [1]/[7.3s], total [6.9s]/[19.6s], memory [20.5gb]->[12.7gb]/[31.9gb], all_pools {[young] [654.9mb]->[1.9mb]/[665.6mb]}{[survivor] [83.1mb]->[0b]/[83.1mb]}{[old] [19.8gb]->[12.7gb]/[31.1gb]}
[2014-03-12 23:11:07,015][INFO ][monitor.jvm ] [Charcoal] [gc][old][34750][166] duration [8s], collections [1]/[8.6s], total [8s]/[29.1s], memory [30.9gb]->[30.9gb]/[31.9gb], all_pools {[young] [660.6mb]->[1mb]/[665.6mb]}{[survivor] [83.1mb]->[0b]/[83.1mb]}{[old] [30.2gb]->[30.9gb]/[31.1gb]}
[2014-03-12 23:12:18,117][INFO ][monitor.jvm ] [Charcoal] [gc][old][34812][182] duration [7.1s], collections [1]/[8.1s], total [7.1s]/[36.6s], memory [31.5gb]->[31.5gb]/[31.9gb], all_pools {[young] [655.6mb]->[410.3mb]/[665.6mb]}{[survivor] [0b]->[0b]/[83.1mb]}{[old] [30.9gb]->[31.1gb]/[31.1gb]}
[2014-03-12 23:12:56,294][INFO ][monitor.jvm ] [Charcoal] [gc][old][34844][193] duration [7.1s], collections [1]/[7.1s], total [7.1s]/[43.9s], memory [31.9gb]->[31.9gb]/[31.9gb], all_pools {[young] [665.6mb]->[665.2mb]/[665.6mb]}{[survivor] [81.9mb]->[82.8mb]/[83.1mb]}{[old] [31.1gb]->[31.1gb]/[31.1gb]}
[2014-03-12 23:13:11,836][WARN ][index.merge.scheduler ] [Charcoal] [myIndex][3] failed to merge
java.lang.OutOfMemoryError: Java heap space
    at org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.loadNumeric(Lucene42DocValuesProducer.java:228)
    at org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.getNumeric(Lucene42DocValuesProducer.java:188)
    at org.apache.lucene.index.SegmentCoreReaders.getNormValues(SegmentCoreReaders.java:159)
    at org.apache.lucene.index.SegmentReader.getNormValues(SegmentReader.java:516)
    at org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:232)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:127)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4071)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3668)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
    at org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:107)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)

Java version:
java version "1.7.0_51"
Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)

elasticsearch.yml: settings that should perhaps be enabled?
#indices.memory.index_buffer_size: 40%
#indices.store.throttle.type: merge
#indices.store.throttle.max_bytes_per_sec: 50mb
#index.refresh_interval: 2s
#index.fielddata.cache: soft
#index.store.type: mmapfs
#index.fielddata.cache.size: 20%

Any ideas how to solve this problem? Why isn't the old gen being cleaned
up? Shouldn't it be?
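
For anyone reading along, the monitor.jvm lines above can be checked mechanically for the "old gen never shrinks" pattern. Below is a minimal Python sketch, assuming the log format shown in this post; the function names and the 0.95 threshold are made up for illustration and are not part of Elasticsearch.

```python
import re

# Matches the old-generation pool of a monitor.jvm line, e.g.
# {[old] [15.1gb]->[15.1gb]/[15.1gb]}  (used before -> used after / capacity)
OLD_POOL = re.compile(
    r"\{\[old\]\s*\[([\d.]+)(gb|mb|kb|b)\]->\[([\d.]+)(gb|mb|kb|b)\]"
    r"/\[([\d.]+)(gb|mb|kb|b)\]\}"
)

UNIT_BYTES = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def to_bytes(value, unit):
    """Convert a value such as ('15.1', 'gb') to a number of bytes."""
    return float(value) * UNIT_BYTES[unit]


def old_gen_usage(log_line):
    """Return (before, after, capacity) in bytes, or None if no old pool is found."""
    match = OLD_POOL.search(log_line)
    if match is None:
        return None
    before = to_bytes(match.group(1), match.group(2))
    after = to_bytes(match.group(3), match.group(4))
    capacity = to_bytes(match.group(5), match.group(6))
    return before, after, capacity


def old_gen_is_stuck(log_line, full_ratio=0.95):
    """True when a collection left the old gen at or above full_ratio of its capacity."""
    usage = old_gen_usage(log_line)
    if usage is None:
        return False
    _, after, capacity = usage
    return after / capacity >= full_ratio


stuck = "[gc][old][32451][309] duration [5.1s] {[old] [15.1gb]->[15.1gb]/[15.1gb]}"
healthy = "[gc][old][32895][86] duration [6.9s] {[old] [19.8gb]->[12.7gb]/[31.1gb]}"
print(old_gen_is_stuck(stuck), old_gen_is_stuck(healthy))
```

A run of "stuck" lines shortly before the OutOfMemoryError means the old-gen collections are finding almost nothing to free, so the heap is full of live objects rather than garbage that is waiting to be collected.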

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c041bde6-ae65-40b8-be7d-9e7b1553fe5d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Alexander Ott) #9

Attached are the elasticsearch.yml and the output of curl -XGET 'localhost:9200/_nodes/'.
We have 5 shards per index and we have not enabled any codecs.

Which bulk size do you recommend, and how many threads can/should
process bulks at the same time? As you can see in _nodes.txt we have 12
available processors...
Should we slow down the bulk loader by adding a wait of a few seconds?

On Tuesday, March 18, 2014 13:22:57 UTC+1, Zachary Tong wrote:

My observations from your Node Stats

  • Your node tends to have around 20-25 merges happening at any given
    time. The default max is 10...have you changed any of the merge policy
    settings? Can you attach your elasticsearch.yml?
  • At one point, your segments were using 24gb of the heap (due to
    associated memory structures like bloom filter, etc). How many primary
    shards are in your index?
  • Your bulks look mostly ok, but you are getting rejections. I'd slow
    the bulk loader down a little bit (rejections mean ES is overloaded)
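
The three observations above can be pulled out of a saved Node Stats response with a few lines of Python. This is only a sketch: the field names (indices.merges.current, indices.segments.memory_in_bytes, thread_pool.bulk.queue and thread_pool.bulk.rejected) are assumptions about the response layout and should be checked against the actual output, and the sample values below are invented.

```python
def summarize_node_stats(stats):
    """Pick a few indexing-pressure indicators out of a parsed _nodes/stats response."""
    summary = {}
    for node_id, node in stats.get("nodes", {}).items():
        indices = node.get("indices", {})
        bulk_pool = node.get("thread_pool", {}).get("bulk", {})
        summary[node_id] = {
            # Merges running right now
            "current_merges": indices.get("merges", {}).get("current", 0),
            # Heap used by segment-level structures
            "segment_memory_bytes": indices.get("segments", {}).get("memory_in_bytes", 0),
            # Bulk requests waiting in the queue, and bulk requests refused so far
            "bulk_queue": bulk_pool.get("queue", 0),
            "bulk_rejected": bulk_pool.get("rejected", 0),
        }
    return summary


# Invented sample shaped like a node stats response (not real measurements)
sample = {
    "nodes": {
        "node-1": {
            "indices": {
                "merges": {"current": 22},
                "segments": {"memory_in_bytes": 1024},
            },
            "thread_pool": {"bulk": {"queue": 50, "rejected": 7}},
        }
    }
}
print(summarize_node_stats(sample))
```

Capturing this summary every few seconds during indexing shows whether the rejection count climbs together with the number of concurrent merges.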

If you can take a heap dump, I would be willing to load it up and look
through the allocated objects. That would be the fastest way to identify
what is eating your heap and start to work on why. To take a heap dump, run
this against the process ID (PID) of the Elasticsearch JVM, then zip the
file up and save it somewhere: jmap -dump:format=b,file=dump.bin PID

As an aside, it's hard to help debug when you don't answer all of the
questions I've asked :stuck_out_tongue:

Unanswered questions from upthread:

  • Have you enabled any codecs and/or changed the posting_format of
    any fields in your document?
  • curl -XGET 'localhost:9200/_nodes/'

Hope we can get this sorted for you soon!
-Zach

On Tuesday, March 18, 2014 5:29:40 AM UTC-4, Alexander Ott wrote:

Attached are the captured node stats and, again, the newest es_log.
I changed the garbage collector from UseParNewGC to UseG1GC, with the
result that the OutOfMemoryError no longer occurs. But as you can see in the
attached es_log file, the monitor.jvm warnings are still present.

On Monday, March 17, 2014 14:32:29 UTC+1, Zachary Tong wrote:

Ah, sorry, I misread your JVM stats dump (thought it was one long list,
instead of multiple calls to the same API). With a single node cluster, 20
concurrent bulks may be too many. Bulk requests have to sit in memory
while they are waiting to be processed, so it is possible to eat up your
heap with many pending bulk requests just hanging out, especially if they
are very large. I'll know more once I can see the Node Stats output.

More questions! :slight_smile:

  • How big are your documents on average?
  • Have you enabled any codecs and/or changed the posting_format of
    any fields in your document?
  • Are you using warmers?

On Monday, March 17, 2014 8:36:04 AM UTC-4, Alexander Ott wrote:

At the moment I can provide only the JVM stats... I will capture the
other stats as soon as possible.

We use 5-20 threads, which process bulks with a max size of 100
entries.
We only use one node/machine for development, so we have no cluster for
development...
The machine has 64gb RAM and we increased the heap from 16gb to 32gb...

On Monday, March 17, 2014 12:21:09 UTC+1, Zachary Tong wrote:

Can you attach the full Node Stats and Node Info output? There were
other stats/metrics that I wanted to check (such as field data, bulk
queue/size, etc).

  • How large (physically, in kb/mb) are your bulk indexing
    requests? Bulks should be 5-15mb in size
  • How many concurrent bulks are you performing? Given your cluster
    size, a good number should probably be around 20-30
  • Are you distributing bulks evenly across the cluster?
  • I see that your heap is 32gb. How big are these machines?
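
Since the bulks in this thread are capped by entry count (100 entries) rather than by physical size, here is a minimal Python sketch of batching by bytes instead. The helper name chunk_bulk_actions, the 10mb default and the index/type names in the usage line are assumptions for illustration; the body layout (one action line followed by one source line, each newline-terminated) is the standard bulk format.

```python
import json


def chunk_bulk_actions(docs, index, doc_type, max_bytes=10 * 1024 * 1024):
    """Yield newline-delimited bulk bodies whose encoded size stays under max_bytes.

    A single document larger than max_bytes is still emitted, alone in its own body.
    """
    lines, size = [], 0
    for doc in docs:
        action = json.dumps({"index": {"_index": index, "_type": doc_type}})
        entry = action + "\n" + json.dumps(doc) + "\n"
        entry_size = len(entry.encode("utf-8"))
        # Close the current body before it would grow past the byte budget
        if lines and size + entry_size > max_bytes:
            yield "".join(lines)
            lines, size = [], 0
        lines.append(entry)
        size += entry_size
    if lines:
        yield "".join(lines)


# Tiny budget so the split is visible; real bulks would use several megabytes
bodies = list(chunk_bulk_actions(({"n": i} for i in range(10)), "myIndex", "doc", max_bytes=200))
print(len(bodies), [len(b.encode("utf-8")) for b in bodies])
```

Measuring the encoded size of each body this way also answers the "how large, physically" question directly, whatever the entry count is.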

-Zach

On Monday, March 17, 2014 5:33:30 AM UTC-4, Alexander Ott wrote:

Hi,

Attached you can find the es_log and the captured node JVM stats.
We are only indexing at this time and we use bulk requests.

As you can see at log entry "2014-03-14 21:18:59,873" in es_log, our
indexing process finished at that time and the OOM occurred afterwards...

On Friday, March 14, 2014 14:47:14 UTC+1, Zachary Tong wrote:

Are you running searches at the same time, or only indexing? Are
you bulk indexing? How big (in physical kb/mb) are your bulk requests?

Can you attach the output of these APIs (preferably during memory
buildup but before the OOM):

  • curl -XGET 'localhost:9200/_nodes/'
  • curl -XGET 'localhost:9200/_nodes/stats'

I would recommend downgrading your JVM to Java 1.7.0_u25. There are
known sigsegv bugs in the most recent versions of the JVM which have not
been fixed yet. It should be unrelated to your problem, but best to rule
the JVM out.

I would not touch any of those configs. In general, when debugging
problems it is best to restore as many of the configs to their default
settings as possible.



(Zachary Tong) #10

Thanks for the rest of the info, that helps rule out a couple of
possibilities. Unfortunately, I was hoping you had fiddled with the merge
settings and it was causing problems...but it looks like everything is
default (which is good!). Back to the drawing board

Would it be possible to get a heapdump and store it somewhere I can access
it? At this point, I think that is our best chance of debugging this
problem.

Which bulk size do you recommend, and how many threads can/should
process bulks at the same time? As you can see in _nodes.txt we have 12
available processors...
Should we slow down the bulk loader by adding a wait of a few seconds?

Bulks tend to be most efficient around 5-15mb in size. For your machine, I
would start with 12 concurrent threads and slowly increase from there. If
you start running into rejections from ES, that's the point where you stop
increasing threads because you've filled the bulk queue with pending tasks
and ES cannot keep up anymore. With rejections, the general pattern is to
wait a random time (1-5s) and then retry all the rejected actions in a new,
smaller bulk.
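
A minimal Python sketch of that retry pattern, under stated assumptions: send_bulk is a hypothetical callable (not an Elasticsearch client method) that sends one batch and returns the actions that were rejected, and halving the batch size on each round is just one way to make the retried bulks "smaller".

```python
import random
import time


def send_with_retry(actions, send_bulk, max_attempts=5, sleep=time.sleep):
    """Send actions, retrying rejected ones in smaller bulks after a random 1-5s wait.

    send_bulk(batch) is a hypothetical callable returning the rejected actions.
    Returns the actions that are still rejected after max_attempts rounds.
    """
    pending = list(actions)
    batch_size = max(1, len(pending))
    for attempt in range(max_attempts):
        rejected = []
        for start in range(0, len(pending), batch_size):
            rejected.extend(send_bulk(pending[start:start + batch_size]))
        if not rejected:
            return []
        # Retry only what was rejected, in bulks half the previous size
        pending = rejected
        batch_size = max(1, batch_size // 2)
        if attempt < max_attempts - 1:
            sleep(random.uniform(1.0, 5.0))
    return pending


# Stand-in for an overloaded node: accepts two actions per bulk, rejects the rest
calls = []


def fake_send(batch):
    calls.append(len(batch))
    return batch[2:]


waits = []
left = send_with_retry(list(range(8)), fake_send, sleep=waits.append)
print(left, calls, len(waits))
```

The random wait keeps several loader threads from retrying at the same instant, which would otherwise refill the bulk queue all at once.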

-Zach



(Arfeen Khan) #11

Hi there,

Has this been resolved? I seem to have a similar issue.

Thank you.


(system) #12