OutOfMemory Error occurred When Flush shard

xiaoliang_tian · March 2, 2015, 8:01am

The Version is 0.90.13

There is Out of memory error occurred when flushing shard

How could I resolve this error?

Any suggestion to prevent it.

Below is the cluster situation:
5 data node,1 master and 1 search node

Dozens of index and each index has more than 100GB data

Another problem is When someone try to query data there is connect
time-out problem,what could cause time out? I think concurrency is ort of
considered,Maybe due to the huge data?

Plz help

Below is OOM error
[2015-03-01 07:23:24,023][WARN ][index.translog ] [Outlaw]
[16494][4] failed to flush shard on translog threshold
org.elasticsearch.index.engine.FlushFailedEngineException: [16494][4] Flush
failed
at
org.elasticsearch.index.engine.robin.RobinEngine.flush(RobinEngine.java:907)
at
org.elasticsearch.index.shard.service.InternalIndexShard.flush(InternalIndexShard.java:563)
at
org.elasticsearch.index.translog.TranslogService$TranslogBasedFlush$1.run(TranslogService.java:194)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: this writer hit an
OutOfMemoryError; cannot commit
at
org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:4354)
at
org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2891)
at
org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2984)
at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2954)
at
org.elasticsearch.index.engine.robin.RobinEngine.flush(RobinEngine.java:893)
... 5 more
[2015-03-01 07:23:27,078][WARN ][cluster.action.shard ] [Outlaw]
[16494][4] sending failed shard for [16494][4],
node[z-YGubBGRe2afo5G8MBPkQ], [P], s[STARTED], indexUUID
[9MjfwirySmWIbqT8clWDwQ], reason [engine failure, message
[OutOfMemoryError[Java heap space]]]
[2015-03-01 07:23:24,030][DEBUG][action.bulk ] [Outlaw]
[16494][4] failed to execute bulk item (index) index
{[16494][cs-us-east-1-logging-swc-rel][c22f6222-c146-4608-9dcc-c8846191c21a],
source[{"version":"0.2","role":"es-data","from":"cs-us-east-1-logging-swc-rel","host":"ip-10-1-33-94-us-east-1-compute-internal","type":"log","time":1425092107803,"level":"system","text":"
27 disks \n 2 partitions \n 47584725 total reads\n
80364 merged reads\n 3159251537 read sectors\n 586297450 milli
reading\n 634621170 writes\n 17059531 merged writes\n 49307983928
written sectors\n 2768108439 milli writing\n 0 inprogress IO\n
426354 milli spent
IO\n","state":"info","service":"snapshot","process":"VMstat","uid":"c22f6222-c146-4608-9dcc-c8846191c21a"}]}
org.elasticsearch.index.engine.IndexFailedEngineException: [16494][4] Index
failed for
[cs-us-east-1-logging-swc-rel#c22f6222-c146-4608-9dcc-c8846191c21a]
at
org.elasticsearch.index.engine.robin.RobinEngine.index(RobinEngine.java:501)
at
org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:386)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:398)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:156)
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:556)
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:426)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
[2015-03-01 07:23:30,699][DEBUG][action.bulk ] [Outlaw]
[16494][4], node[z-YGubBGRe2afo5G8MBPkQ], [P], s[STARTED]: Failed to
execute [org.elasticsearch.action.bulk.BulkShardRequest@7fca9100]
java.lang.NullPointerException
at
org.elasticsearch.action.bulk.TransportShardBulkAction.applyVersion(TransportShardBulkAction.java:640)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:178)
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:556)
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:426)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2015-03-01 07:23:30,719][WARN ][indices.cluster ] [Outlaw]
[16494][4] master
[[Tyrak][gJq80HQDTSKtZO56q81pIg][inet[/10.1.33.53:9300]]{data=false,
rack=rack_tone, max_local_storage_nodes=1, master=true}] marked shard as
started, but shard has not been created, mark shard as failed
[2015-03-01 07:23:30,728][WARN ][cluster.action.shard ] [Outlaw]
[16494][4] sending failed shard for [16494][4],
node[z-YGubBGRe2afo5G8MBPkQ], [P], s[STARTED], indexUUID
[9MjfwirySmWIbqT8clWDwQ], reason [master
[Tyrak][gJq80HQDTSKtZO56q81pIg][inet[/10.1.33.53:9300]]{data=false,
rack=rack_tone, max_local_storage_nodes=1, master=true} marked shard as
started, but shard has not been created, mark shard as failed]

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/06d33d9c-b684-4bd6-87f2-671b0704c763%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

xiaoliang_tian · March 2, 2015, 8:19am

About the time out error I found error log below,I don't know what that mean

[2015-02-27 21:59:41,575][DEBUG][action.search.type ] [Selene]
[487030] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [Red
Wolf][inet[/10.1.33.77:9300]][search/phase/fetch/id]
Caused by: org.elasticsearch.search.SearchContextMissingException: No
search context found for id [487030]
at
org.elasticsearch.search.SearchService.findContext(SearchService.java:460)
at
org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:433)
at
org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:728)
at
org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:717)
at
org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:270)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2015-02-27 21:59:41,576][DEBUG][action.search.type ] [Selene]
[475699] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException:
[S'byll][inet[/10.1.33.94:9300]][search/phase/fetch/id]
Caused by: org.elasticsearch.search.SearchContextMissingException: No
search context found for id [475699]
at
org.elasticsearch.search.SearchService.findContext(SearchService.java:460)
at
org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:433)
at
org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:728)
at
org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:717)
at
org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:270)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2015-02-27 21:59:41,575][DEBUG][action.search.type ] [Selene]
[487026] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [Red
Wolf][inet[/10.1.33.77:9300]][search/phase/fetch/id]
Caused by: org.elasticsearch.search.SearchContextMissingException: No
search context found for id [487026]
at
org.elasticsearch.search.SearchService.findContext(SearchService.java:460)
at
org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:433)
at
org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:728)
at
org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:717)
at
org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:270)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2015-02-27 21:59:41,575][DEBUG][action.search.type ] [Selene]
[460892] Failed to execute fetch

在 2015年3月2日星期一 UTC+8下午4:01:48，xiaoliang tian写道：

The Version is 0.90.13

There is Out of memory error occurred when flushing shard

How could I resolve this error?

Any suggestion to prevent it.

Below is the cluster situation:
5 data node,1 master and 1 search node

Dozens of index and each index has more than 100GB data

Another problem is When someone try to query data there is connect
time-out problem,what could cause time out? I think concurrency is ort of
considered,Maybe due to the huge data?

Plz help

Below is OOM error
[2015-03-01 07:23:24,023][WARN ][index.translog ] [Outlaw]
[16494][4] failed to flush shard on translog threshold
org.elasticsearch.index.engine.FlushFailedEngineException: [16494][4]
Flush failed
at
org.elasticsearch.index.engine.robin.RobinEngine.flush(RobinEngine.java:907)
at
org.elasticsearch.index.shard.service.InternalIndexShard.flush(InternalIndexShard.java:563)
at
org.elasticsearch.index.translog.TranslogService$TranslogBasedFlush$1.run(TranslogService.java:194)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: this writer hit an
OutOfMemoryError; cannot commit
at
org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:4354)
at
org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2891)
at
org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2984)
at
org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2954)
at
org.elasticsearch.index.engine.robin.RobinEngine.flush(RobinEngine.java:893)
... 5 more
[2015-03-01 07:23:27,078][WARN ][cluster.action.shard ] [Outlaw]
[16494][4] sending failed shard for [16494][4],
node[z-YGubBGRe2afo5G8MBPkQ], [P], s[STARTED], indexUUID
[9MjfwirySmWIbqT8clWDwQ], reason [engine failure, message
[OutOfMemoryError[Java heap space]]]
[2015-03-01 07:23:24,030][DEBUG][action.bulk ] [Outlaw]
[16494][4] failed to execute bulk item (index) index
{[16494][cs-us-east-1-logging-swc-rel][c22f6222-c146-4608-9dcc-c8846191c21a],
source[{"version":"0.2","role":"es-data","from":"cs-us-east-1-logging-swc-rel","host":"ip-10-1-33-94-us-east-1-compute-internal","type":"log","time":1425092107803,"level":"system","text":"
27 disks \n 2 partitions \n 47584725 total reads\n
80364 merged reads\n 3159251537 read sectors\n 586297450 milli
reading\n 634621170 writes\n 17059531 merged writes\n 49307983928
written sectors\n 2768108439 milli writing\n 0 inprogress IO\n
426354 milli spent
IO\n","state":"info","service":"snapshot","process":"VMstat","uid":"c22f6222-c146-4608-9dcc-c8846191c21a"}]}
org.elasticsearch.index.engine.IndexFailedEngineException: [16494][4]
Index failed for
[cs-us-east-1-logging-swc-rel#c22f6222-c146-4608-9dcc-c8846191c21a]
at
org.elasticsearch.index.engine.robin.RobinEngine.index(RobinEngine.java:501)
at
org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:386)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:398)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:156)
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:556)
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:426)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
[2015-03-01 07:23:30,699][DEBUG][action.bulk ] [Outlaw]
[16494][4], node[z-YGubBGRe2afo5G8MBPkQ], [P], s[STARTED]: Failed to
execute [org.elasticsearch.action.bulk.BulkShardRequest@7fca9100]
java.lang.NullPointerException
at
org.elasticsearch.action.bulk.TransportShardBulkAction.applyVersion(TransportShardBulkAction.java:640)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:178)
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:556)
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:426)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2015-03-01 07:23:30,719][WARN ][indices.cluster ] [Outlaw]
[16494][4] master
[[Tyrak][gJq80HQDTSKtZO56q81pIg][inet[/10.1.33.53:9300]]{data=false,
rack=rack_tone, max_local_storage_nodes=1, master=true}] marked shard as
started, but shard has not been created, mark shard as failed
[2015-03-01 07:23:30,728][WARN ][cluster.action.shard ] [Outlaw]
[16494][4] sending failed shard for [16494][4],
node[z-YGubBGRe2afo5G8MBPkQ], [P], s[STARTED], indexUUID
[9MjfwirySmWIbqT8clWDwQ], reason [master
[Tyrak][gJq80HQDTSKtZO56q81pIg][inet[/10.1.33.53:9300]]{data=false,
rack=rack_tone, max_local_storage_nodes=1, master=true} marked shard as
started, but shard has not been created, mark shard as failed]

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/d867c194-24a6-49ef-a08d-f579e9b5e457%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

warkolm · March 2, 2015, 8:36am

You should really upgrade!

If you can, try deleting/closing some indices to reduce the load, also
given it's <1.X you can try disabling bloom filters.

On 2 March 2015 at 19:01, xiaoliang tian xiaoliang.tian@gmail.com wrote:

The Version is 0.90.13

There is Out of memory error occurred when flushing shard

How could I resolve this error?

Any suggestion to prevent it.

Below is the cluster situation:
5 data node,1 master and 1 search node

Dozens of index and each index has more than 100GB data

Another problem is When someone try to query data there is connect
time-out problem,what could cause time out? I think concurrency is ort of
considered,Maybe due to the huge data?

Plz help

Below is OOM error
[2015-03-01 07:23:24,023][WARN ][index.translog ] [Outlaw]
[16494][4] failed to flush shard on translog threshold
org.elasticsearch.index.engine.FlushFailedEngineException: [16494][4]
Flush failed
at
org.elasticsearch.index.engine.robin.RobinEngine.flush(RobinEngine.java:907)
at
org.elasticsearch.index.shard.service.InternalIndexShard.flush(InternalIndexShard.java:563)
at
org.elasticsearch.index.translog.TranslogService$TranslogBasedFlush$1.run(TranslogService.java:194)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: this writer hit an
OutOfMemoryError; cannot commit
at
org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:4354)
at
org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2891)
at
org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2984)
at
org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2954)
at
org.elasticsearch.index.engine.robin.RobinEngine.flush(RobinEngine.java:893)
... 5 more
[2015-03-01 07:23:27,078][WARN ][cluster.action.shard ] [Outlaw]
[16494][4] sending failed shard for [16494][4],
node[z-YGubBGRe2afo5G8MBPkQ], [P], s[STARTED], indexUUID
[9MjfwirySmWIbqT8clWDwQ], reason [engine failure, message
[OutOfMemoryError[Java heap space]]]
[2015-03-01 07:23:24,030][DEBUG][action.bulk ] [Outlaw]
[16494][4] failed to execute bulk item (index) index
{[16494][cs-us-east-1-logging-swc-rel][c22f6222-c146-4608-9dcc-c8846191c21a],
source[{"version":"0.2","role":"es-data","from":"cs-us-east-1-logging-swc-rel","host":"ip-10-1-33-94-us-east-1-compute-internal","type":"log","time":1425092107803,"level":"system","text":"
27 disks \n 2 partitions \n 47584725 total reads\n
80364 merged reads\n 3159251537 read sectors\n 586297450 milli
reading\n 634621170 writes\n 17059531 merged writes\n 49307983928
written sectors\n 2768108439 milli writing\n 0 inprogress IO\n
426354 milli spent
IO\n","state":"info","service":"snapshot","process":"VMstat","uid":"c22f6222-c146-4608-9dcc-c8846191c21a"}]}
org.elasticsearch.index.engine.IndexFailedEngineException: [16494][4]
Index failed for
[cs-us-east-1-logging-swc-rel#c22f6222-c146-4608-9dcc-c8846191c21a]
at
org.elasticsearch.index.engine.robin.RobinEngine.index(RobinEngine.java:501)
at
org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:386)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:398)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:156)
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:556)
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:426)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
[2015-03-01 07:23:30,699][DEBUG][action.bulk ] [Outlaw]
[16494][4], node[z-YGubBGRe2afo5G8MBPkQ], [P], s[STARTED]: Failed to
execute [org.elasticsearch.action.bulk.BulkShardRequest@7fca9100]
java.lang.NullPointerException
at
org.elasticsearch.action.bulk.TransportShardBulkAction.applyVersion(TransportShardBulkAction.java:640)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:178)
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:556)
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:426)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2015-03-01 07:23:30,719][WARN ][indices.cluster ] [Outlaw]
[16494][4] master [[Tyrak][gJq80HQDTSKtZO56q81pIg][inet[/10.1.33.53:9300]]{data=false,
rack=rack_tone, max_local_storage_nodes=1, master=true}] marked shard as
started, but shard has not been created, mark shard as failed
[2015-03-01 07:23:30,728][WARN ][cluster.action.shard ] [Outlaw]
[16494][4] sending failed shard for [16494][4],
node[z-YGubBGRe2afo5G8MBPkQ], [P], s[STARTED], indexUUID
[9MjfwirySmWIbqT8clWDwQ], reason [master
[Tyrak][gJq80HQDTSKtZO56q81pIg][inet[/10.1.33.53:9300]]{data=false,
rack=rack_tone, max_local_storage_nodes=1, master=true} marked shard as
started, but shard has not been created, mark shard as failed]

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/06d33d9c-b684-4bd6-87f2-671b0704c763%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/06d33d9c-b684-4bd6-87f2-671b0704c763%40googlegroups.com?utm_medium=email&utm_source=footer
.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAEYi1X9eP3wdoHusTtOZYNG2E-x%2BWKKdVGN2_tW8ZC9OSJnckA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

xiaoliang_tian · March 2, 2015, 9:20am

Ok,If I upgrade how should I keep all old data

2015-03-02 16:36 GMT+08:00 Mark Walkom markwalkom@gmail.com:

You should really upgrade!

If you can, try deleting/closing some indices to reduce the load, also
given it's <1.X you can try disabling bloom filters.

On 2 March 2015 at 19:01, xiaoliang tian xiaoliang.tian@gmail.com wrote:

The Version is 0.90.13

There is Out of memory error occurred when flushing shard

How could I resolve this error?

Any suggestion to prevent it.

Below is the cluster situation:
5 data node,1 master and 1 search node

Dozens of index and each index has more than 100GB data

Another problem is When someone try to query data there is connect
time-out problem,what could cause time out? I think concurrency is ort of
considered,Maybe due to the huge data?

Plz help

Below is OOM error
[2015-03-01 07:23:24,023][WARN ][index.translog ] [Outlaw]
[16494][4] failed to flush shard on translog threshold
org.elasticsearch.index.engine.FlushFailedEngineException: [16494][4]
Flush failed
at
org.elasticsearch.index.engine.robin.RobinEngine.flush(RobinEngine.java:907)
at
org.elasticsearch.index.shard.service.InternalIndexShard.flush(InternalIndexShard.java:563)
at
org.elasticsearch.index.translog.TranslogService$TranslogBasedFlush$1.run(TranslogService.java:194)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: this writer hit an
OutOfMemoryError; cannot commit
at
org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:4354)
at
org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2891)
at
org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2984)
at
org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2954)
at
org.elasticsearch.index.engine.robin.RobinEngine.flush(RobinEngine.java:893)
... 5 more
[2015-03-01 07:23:27,078][WARN ][cluster.action.shard ] [Outlaw]
[16494][4] sending failed shard for [16494][4],
node[z-YGubBGRe2afo5G8MBPkQ], [P], s[STARTED], indexUUID
[9MjfwirySmWIbqT8clWDwQ], reason [engine failure, message
[OutOfMemoryError[Java heap space]]]
[2015-03-01 07:23:24,030][DEBUG][action.bulk ] [Outlaw]
[16494][4] failed to execute bulk item (index) index
{[16494][cs-us-east-1-logging-swc-rel][c22f6222-c146-4608-9dcc-c8846191c21a],
source[{"version":"0.2","role":"es-data","from":"cs-us-east-1-logging-swc-rel","host":"ip-10-1-33-94-us-east-1-compute-internal","type":"log","time":1425092107803,"level":"system","text":"
27 disks \n 2 partitions \n 47584725 total reads\n
80364 merged reads\n 3159251537 read sectors\n 586297450 milli
reading\n 634621170 writes\n 17059531 merged writes\n 49307983928
written sectors\n 2768108439 milli writing\n 0 inprogress IO\n
426354 milli spent
IO\n","state":"info","service":"snapshot","process":"VMstat","uid":"c22f6222-c146-4608-9dcc-c8846191c21a"}]}
org.elasticsearch.index.engine.IndexFailedEngineException: [16494][4]
Index failed for
[cs-us-east-1-logging-swc-rel#c22f6222-c146-4608-9dcc-c8846191c21a]
at
org.elasticsearch.index.engine.robin.RobinEngine.index(RobinEngine.java:501)
at
org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:386)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:398)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:156)
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:556)
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:426)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
[2015-03-01 07:23:30,699][DEBUG][action.bulk ] [Outlaw]
[16494][4], node[z-YGubBGRe2afo5G8MBPkQ], [P], s[STARTED]: Failed to
execute [org.elasticsearch.action.bulk.BulkShardRequest@7fca9100]
java.lang.NullPointerException
at
org.elasticsearch.action.bulk.TransportShardBulkAction.applyVersion(TransportShardBulkAction.java:640)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:178)
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:556)
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:426)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2015-03-01 07:23:30,719][WARN ][indices.cluster ] [Outlaw]
[16494][4] master [[Tyrak][gJq80HQDTSKtZO56q81pIg][inet[/10.1.33.53:9300]]{data=false,
rack=rack_tone, max_local_storage_nodes=1, master=true}] marked shard as
started, but shard has not been created, mark shard as failed
[2015-03-01 07:23:30,728][WARN ][cluster.action.shard ] [Outlaw]
[16494][4] sending failed shard for [16494][4],
node[z-YGubBGRe2afo5G8MBPkQ], [P], s[STARTED], indexUUID
[9MjfwirySmWIbqT8clWDwQ], reason [master
[Tyrak][gJq80HQDTSKtZO56q81pIg][inet[/10.1.33.53:9300]]{data=false,
rack=rack_tone, max_local_storage_nodes=1, master=true} marked shard as
started, but shard has not been created, mark shard as failed]

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/06d33d9c-b684-4bd6-87f2-671b0704c763%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/06d33d9c-b684-4bd6-87f2-671b0704c763%40googlegroups.com?utm_medium=email&utm_source=footer
.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to a topic in the
Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit
https://groups.google.com/d/topic/elasticsearch/pMZfUOXgU4M/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X9eP3wdoHusTtOZYNG2E-x%2BWKKdVGN2_tW8ZC9OSJnckA%40mail.gmail.com
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X9eP3wdoHusTtOZYNG2E-x%2BWKKdVGN2_tW8ZC9OSJnckA%40mail.gmail.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAJ%3DxLsUFwMqQcLHacX6idq%3DHW6AhAy2dyewn7qdGW_8a-8PdpA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Topic		Replies	Views
Out of Memory Error Elasticsearch	11	3033	July 5, 2017
Lost data due to out of memory error Elasticsearch	6	1408	July 6, 2017
OutOfMemoryError Leading To IndexShardMissingException Elasticsearch	3	440	July 6, 2017
Clarification in Elasticsearch OUTOFMEMORY warning? Elasticsearch	2	675	July 21, 2017
Best way to resolve this out of memory error in my ES Data Nodes Elasticsearch	3	880	March 6, 2021

OutOfMemory Error occurred When Flush shard

Related topics