We are running a small Elasticsearch cluster of three nodes, version 1.0.1.
Each node has 7 GB of RAM. Our software creates daily indexes for storing its
data; a daily index is around 5 GB. Unfortunately, for some reason
Elasticsearch eats up all the RAM and hangs the node, even though the maximum
heap size is set to 6 GB. So we decided to use monit to restart it when
memory usage reaches 90%. This works, but sometimes we get errors like this:
[2014-03-22 16:56:04,943][DEBUG][action.search.type] [es-00] [product-22-03-2014][0], node[jbUDVzuvS5GTM7iOG8iwzQ], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@687dc039]
org.elasticsearch.search.fetch.FetchPhaseExecutionException: [product-22-03-2014][0]: query[filtered(ToParentBlockJoinQuery(filtered(history.created:[1392574921000 TO *])->cache(_type:__history)))->cache(_type:product)],from[0],size[1000],sort[<custom:"history.created": org.elasticsearch.index.search.nested.NestedFieldComparatorSource@15e4ece9>]: Fetch Failed [Failed to fetch doc id [7263214]]
    at org.elasticsearch.search.fetch.FetchPhase.loadStoredFields(FetchPhase.java:230)
    at org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:156)
    at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:332)
    at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:304)
    at org.elasticsearch.action.search.type.TransportSearchQueryAndFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryAndFetchAction.java:71)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$4.run(TransportSearchTypeAction.java:292)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.io.EOFException: seek past EOF: MMapIndexInput(path="/opt/elasticsearch/main/nodes/0/indices/product-22-03-2014/0/index/_9lz.fdt")
    at org.apache.lucene.store.ByteBufferIndexInput.seek(ByteBufferIndexInput.java:174)
    at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:229)
    at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:276)
    at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
    at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:196)
    at org.elasticsearch.search.fetch.FetchPhase.loadStoredFields(FetchPhase.java:228)
    ... 9 more
[2014-03-22 16:56:04,944][DEBUG][action.search.type] [es-00] All shards failed for phase: [query_fetch]
According to our logs, this seems to happen when one or two nodes get
restarted. Stranger still, the same shard became corrupted on all nodes of
the cluster. Why could this happen? How can we fix it? And can you suggest
how we should fix the memory usage?
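For reference, the monit check we set up is essentially the stanza below (the pidfile and init-script paths are illustrative and may differ from our actual installation):

```
# /etc/monit/conf.d/elasticsearch -- illustrative paths
check process elasticsearch with pidfile /var/run/elasticsearch/elasticsearch.pid
  start program = "/etc/init.d/elasticsearch start"
  stop program  = "/etc/init.d/elasticsearch stop"
  # restart the node once process memory stays above 90% of total RAM
  if totalmem > 90% for 3 cycles then restart
```

We suspect these forced restarts may be related to the shard corruption, since monit's stop can escalate to killing the process while it is still writing.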
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/9e40c2a4-6a76-454d-a96b-483cdbf3e946%40googlegroups.com.