I've been performance testing a 3-node ES 1.7 cluster (16GB RAM / 8GB heap, 256GB SSD per node) with logging data - for example, two indices of around 128GB each, 5 shards, no replicas.
As it's logging data, I wanted to test out _optimize. If I try to optimize down to 1 segment I get an out-of-memory exception; in fact, the smallest max_num_segments I can merge down to without an out-of-memory error is 4. Does a segment have to fit entirely in memory for _optimize (or _forcemerge in 2.0+)? I've done some simple testing on synthetic data but can't confirm this.
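For reference, this is roughly the call I'm making (host and index name are from my test setup, adjust as needed):

    curl -XPOST 'http://localhost:9200/logstash-2015.02.01/_optimize?max_num_segments=1'

    # In 2.0+ the equivalent endpoint is _forcemerge:
    curl -XPOST 'http://localhost:9200/logstash-2015.02.01/_forcemerge?max_num_segments=1'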
Make sure you aren't calling a force merge on an index you're currently writing to. Only after you have finished indexing should you call _optimize, on an index that is effectively read-only - something like the sketch below.
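A sketch against the 1.x REST API (the index name is just an example):

    # Stop or pause indexing into the target index first, then flush and merge:
    curl -XPOST 'http://localhost:9200/logstash-2015.02.01/_flush'
    curl -XPOST 'http://localhost:9200/logstash-2015.02.01/_optimize?max_num_segments=1'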
There are a great many reasons that a JVM can run into a java.lang.OutOfMemoryError, and many of them do not involve the heap being exhausted. Please read the details of the message carefully and post them here.
[2016-02-16 13:32:09,088][DEBUG][action.admin.indices.optimize] [node1] [logstash-2015.02.01], node[a4HtxNaFRZ-7Yd7Vk-JcUQ], [P], s[STARTED]: failed to execute [OptimizeRequest{maxNumSegments=1, onlyExpungeDeletes=false, flush=true, force=false}]
org.elasticsearch.transport.RemoteTransportException: [node2][inet[/192.168.0.5:9300]][indices:admin/optimize[s]]
Caused by: org.elasticsearch.index.engine.OptimizeFailedEngineException: [logstash-2015.02.01][2] force merge failed
at org.elasticsearch.index.engine.InternalEngine.forceMerge(InternalEngine.java:829)
at org.elasticsearch.index.shard.IndexShard.optimize(IndexShard.java:734)
....
Caused by: java.lang.IllegalStateException: this writer hit an unrecoverable error; cannot complete forceMerge
at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1819)
at org.elasticsearch.index.engine.InternalEngine.forceMerge(InternalEngine.java:817)
... 10 more
Caused by: java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.util.packed.Packed64SingleBlock.<init>(Packed64SingleBlock.java:53)
at org.apache.lucene.util.packed.Packed64SingleBlock$Packed64SingleBlock1.<init>(Packed64SingleBlock.java:256)
at org.apache.lucene.util.packed.Packed64SingleBlock.create(Packed64SingleBlock.java:221)
at org.apache.lucene.util.packed.Packed64SingleBlock.create(Packed64SingleBlock.java:211)
at org.apache.lucene.util.packed.PackedInts.getReaderNoHeader(PackedInts.java:784)
another time:
Caused by: java.io.IOException: background merge hit exception: _t6j(4.10.4):C11989155 _t6i(4.10.4):C7342178 _d8t(4.10.4):C5657347 into _t6k [maxNumSegments=1]
at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1819)
at org.elasticsearch.index.engine.InternalEngine.forceMerge(InternalEngine.java:817)
... 10 more
Caused by: java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.util.packed.Packed64SingleBlock.<init>(Packed64SingleBlock.java:53)
at org.apache.lucene.util.packed.Packed64SingleBlock$Packed64SingleBlock4.<init>(Packed64SingleBlock.java:328)
at org.apache.lucene.util.packed.Packed64SingleBlock.create(Packed64SingleBlock.java:227)
at org.apache.lucene.util.packed.Packed64SingleBlock.create(Packed64SingleBlock.java:140)
at org.apache.lucene.util.packed.PackedInts.getReaderNoHeader(PackedInts.java:784)
at org.apache.lucene.codecs.lucene49.Lucene49NormsProducer.loadNorms(Lucene49NormsProducer.java:198)
...
another time:
Caused by: org.apache.lucene.index.MergePolicy$MergeAbortedException: merge is aborted: _t6j(4.10.4):C11989155 _t6i(4.10.4):C7342178 _d8t(4.10.4):C5657347 into _t6k [maxNumSegments=1] [ABORTED]
at org.apache.lucene.index.MergePolicy$OneMerge.checkAborted(MergePolicy.java:201)
at org.apache.lucene.index.MergeState$CheckAbort.work(MergeState.java:199)
at org.apache.lucene.codecs.DocValuesConsumer.mergeNumericField(DocValuesConsumer.java:129)
at org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:133)
...
Looking at the _segments API while it's merging, the memory_in_bytes of the segments in a shard grows to close to the size of my heap - does the optimize force all the norms to be loaded on-heap, which could cause the out-of-memory error? This is an old dataset with lots of analyzed fields.
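For anyone following along, I'm watching the merge with something like this (example index name; the second call gives the rolled-up per-index totals rather than per-segment detail):

    curl -XGET 'http://localhost:9200/logstash-2015.02.01/_segments?pretty'
    curl -XGET 'http://localhost:9200/logstash-2015.02.01/_stats/segments?pretty'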