Elasticsearch node "failed to merge" exception

Hi all,

We are getting the message below in our log files, and the shard is becoming unassigned.

Can anyone help with this?

[2016-07-06 22:55:53,133][WARN ][index.merge.scheduler ] [es-sv22-13] [stag-index][0] failed to merge
java.lang.OutOfMemoryError: Java heap space
[2016-07-06 22:55:53,139][WARN ][index.engine.internal ] [es-sv22-13] [stag-index][0] failed engine [merge exception]
[2016-07-06 22:55:54,023][WARN ][cluster.action.shard ] [es-sv22-13] [stag-index][0] sending failed shard for [stag-index][0], node[vmF0ZU8lQdifAbU-Nvr3lQ], [P], s[STARTED], indexUUID [na], reason [engine failure, message [merge exception][MergeException[java.lang.OutOfMemoryError: Java heap space]; nested: OutOfMemoryError[Java heap space]; ]]

Hi,

From this little information, it is only clear that your node is running out of memory. Some context is needed to dig deeper. What version of ES are you using? What is the general cluster setup (number of nodes, indices, sharding strategy, available memory, etc.)? What is your use case (e.g. full-text search, large aggregations, etc.)? And if this happens frequently, can you spot a pattern in the logs that triggers the OOM?
Please provide some more information so that people on the forum are more likely to be able to help.
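
Most of that can be pulled straight from the HTTP API. A rough sketch (assuming the Python requests package and that one of your nodes is reachable on localhost:9200; adjust host and port to your setup):

import requests

ES = "http://localhost:9200"

# Elasticsearch version (the root endpoint reports it)
print(requests.get(ES).json()["version"]["number"])

# Node count and shard health for the whole cluster
health = requests.get(ES + "/_cluster/health").json()
print("nodes:", health["number_of_nodes"],
      "active shards:", health["active_shards"],
      "unassigned:", health["unassigned_shards"])

# Indices with primary/replica counts and on-disk size
print(requests.get(ES + "/_cat/indices?v").text)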

Hi @cbuescher,

Thanks for the reply; below are the details you asked for.
ES version: 1.3.7
Our cluster runs on 5 nodes with 1 index, 5 primary shards, and 1 replica. All nodes have a 2G heap.
The index size is nearly 20.6G.
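
(For reference: with 5 primaries plus 1 replica of a ~20.6G index, that is roughly 41G of shard data spread over 5 nodes, i.e. on the order of 8G of index data per node against a 2G heap, assuming the 20.6G figure is the primary size only. Per-node heap usage and shard placement can be checked roughly like this, again a sketch assuming the Python requests package and a node on localhost:9200:)

import requests

ES = "http://localhost:9200"

# Configured heap and current usage per node
nodes = requests.get(ES + "/_nodes/stats/jvm").json()["nodes"]
for node in nodes.values():
    mem = node["jvm"]["mem"]
    print(node["name"],
          "heap max:", mem["heap_max_in_bytes"] // (1024 ** 2), "MB,",
          "used:", mem["heap_used_percent"], "%")

# Which shards live on which node, and how big each one is
print(requests.get(ES + "/_cat/shards?v").text)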

Below is a log excerpt that may help:

[2016-07-07 10:03:38,265][WARN ][index.codec ] [es-sv22-17] [apptivodb6] no index mapper found for field: [taskStatuses.code] returning default postings format
[2016-07-07 10:03:38,266][WARN ][index.codec ] [es-sv22-17] [apptivodb6] no index mapper found for field: [taskStatuses.id] returning default postings format
[2016-07-07 10:03:38,266][WARN ][index.codec ] [es-sv22-17] [apptivodb6] no index mapper found for field: [taskStatuses.isEnabled] returning default postings format
[2016-07-07 10:03:38,266][WARN ][index.codec ] [es-sv22-17] [apptivodb6] no index mapper found for field: [taskStatuses.name] returning default postings format
[2016-07-07 10:03:39,170][WARN ][index.codec ] [es-sv22-17] [apptivodb6] no index mapper found for field: [uomCategories.uoms.description] returning default postings format
[2016-07-07 10:12:10,619][WARN ][index.merge.scheduler ] [es-sv22-17] [apptivodb6][0] failed to merge
java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.loadNumeric(Lucene42DocValuesProducer.java:251)
at org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.getNumeric(Lucene42DocValuesProducer.java:204)
at org.apache.lucene.index.SegmentCoreReaders.getNormValues(SegmentCoreReaders.java:176)
at org.apache.lucene.index.SegmentReader.getNormValues(SegmentReader.java:592)
at org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:248)
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:133)
at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4225)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3820)
at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
at org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:106)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
[2016-07-07 10:12:10,644][WARN ][index.engine.internal ] [es-sv22-17] [apptivodb6][0] failed engine [merge exception]
[2016-07-07 10:12:10,970][WARN ][cluster.action.shard ] [es-sv22-17] [apptivodb6][0] sending failed shard for [apptivodb6][0], node[zH-M5sZLQYyGP38n22S43w], relocating [ou49VvIjTniN_mUkFDx0yg], [R], s[RELOCATING], indexUUID [na], reason [engine failure, message [merge exception][MergeException[java.lang.OutOfMemoryError: Java heap space]; nested: OutOfMemoryError[Java heap space]; ]]
[2016-07-07 18:05:47,152][DEBUG][action.search.type ] [es-sv22-17] [apptivodb6][2], node[HLg8WZF9Soaql9kZU7Cbqg], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@1bc90f4c]
org.elasticsearch.transport.RemoteTransportException: [es-sv22-16][inet[/10.80.3.106:19606]][search/phase/query+fetch]
Caused by: org.elasticsearch.search.query.QueryPhaseExecutionException: [apptivodb6][2]: query[ConstantScore(cache(_type:2))],from[0],size[20]: Query Failed [Failed to execute main query]
at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:162)
at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:335)
at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryFetchTransportHandler.messageReceived(SearchServiceTransportAction.java:751)
at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryFetchTransportHandler.messageReceived(SearchServiceTransportAction.java:740)
at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: field "description" was indexed without position data; cannot run PhraseQuery (term=test)
at org.apache.lucene.search.PhraseQuery$PhraseWeight.scorer(PhraseQuery.java:277)
at org.apache.lucene.search.QueryWrapperFilter$1.iterator(QueryWrapperFilter.java:59)
at org.elasticsearch.common.lucene.docset.AndDocIdSet$IteratorBasedIterator.newDocIdSetIterator(AndDocIdSet.java:146)
at org.elasticsearch.common.lucene.docset.AndDocIdSet.iterator(AndDocIdSet.java:92)
at org.elasticsearch.common.lucene.docset.DocIdSets.toSafeBits(DocIdSets.java:100)
at org.elasticsearch.common.lucene.search.FilteredCollector.setNextReader(FilteredCollector.java:68)
.....

Hi all,
We have run into the same issue again on one cluster.
I have attached the graphs, which show the JVM heap fully used, leading to the OutOfMemoryError entries in the logs.

We are using Elasticsearch 1.3.7.
Heap allocated: 1.5G
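
The kind of heap-usage-over-time data shown in the graphs can be sampled with a simple polling loop, for example (a sketch, assuming the Python requests package and a node on localhost:9200):

import time
import requests

ES = "http://localhost:9200"

# Sample per-node JVM heap usage every 30 seconds
while True:
    nodes = requests.get(ES + "/_nodes/stats/jvm").json()["nodes"]
    for node in nodes.values():
        print(time.strftime("%H:%M:%S"),
              node["name"],
              node["jvm"]["mem"]["heap_used_percent"], "% heap used")
    time.sleep(30)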

log:

[2016-10-15 00:48:20,029][WARN ][index.merge.scheduler ] [node-1] [<index_name>][0] failed to merge
java.lang.OutOfMemoryError: Java heap space