The version is 0.90.13.
An out-of-memory error occurred when flushing a shard.
How can I resolve this error? Any suggestions to prevent it would be appreciated.
Below is the cluster situation:
5 data nodes, 1 master node, and 1 search node
Dozens of indices, and each index holds more than 100 GB of data
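To see where the heap is going before the flush fails, I plan to pull per-node JVM stats over the HTTP API, roughly like the sketch below (the host/port and the exact stats endpoint and field names are my assumptions for 0.90.x, not taken from the logs):

import json
import urllib.request

ES_URL = "http://localhost:9200"  # hypothetical host/port, not from the logs

def heap_per_node(base_url=ES_URL):
    # jvm=true limits the response to JVM stats; field names may differ slightly in 0.90.x
    with urllib.request.urlopen(base_url + "/_nodes/stats?jvm=true", timeout=10) as resp:
        stats = json.load(resp)
    for node_id, node in stats.get("nodes", {}).items():
        mem = node.get("jvm", {}).get("mem", {})
        print(node.get("name", node_id),
              "heap_used:", mem.get("heap_used"),
              "heap_committed:", mem.get("heap_committed"))

if __name__ == "__main__":
    heap_per_node()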
Another problem: when someone tries to query data, there is a connection
timeout. What could cause the timeout? I'm not sure whether concurrency is a
factor; maybe it is due to the huge data volume?
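To narrow the timeout down, I was going to time a trivial search with an explicit client-side timeout, as in the sketch below, to see whether queries are merely slow or the connection itself fails (host, index name, and timeout value are placeholders, not from my real client):

import json
import time
import urllib.request

ES_URL = "http://localhost:9200"  # hypothetical host/port
INDEX = "my-index"                # placeholder; not the real index name

def timed_search(index=INDEX, timeout_s=30):
    # a trivial match_all with size 0, just to measure round-trip time
    body = json.dumps({"query": {"match_all": {}}, "size": 0}).encode("utf-8")
    req = urllib.request.Request("%s/%s/_search" % (ES_URL, index), data=body)
    start = time.time()
    try:
        with urllib.request.urlopen(req, timeout=timeout_s) as resp:
            result = json.load(resp)
        print("took_ms:", result.get("took"), "wall_s:", round(time.time() - start, 2))
    except Exception as exc:  # socket timeout, connection refused, etc.
        print("failed after %.1fs: %r" % (time.time() - start, exc))

if __name__ == "__main__":
    timed_search()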
Any help would be appreciated.
Below is the OOM error:
[2015-03-01 07:23:24,023][WARN ][index.translog ] [Outlaw] [16494][4] failed to flush shard on translog threshold
org.elasticsearch.index.engine.FlushFailedEngineException: [16494][4] Flush failed
    at org.elasticsearch.index.engine.robin.RobinEngine.flush(RobinEngine.java:907)
    at org.elasticsearch.index.shard.service.InternalIndexShard.flush(InternalIndexShard.java:563)
    at org.elasticsearch.index.translog.TranslogService$TranslogBasedFlush$1.run(TranslogService.java:194)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot commit
    at org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:4354)
    at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2891)
    at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2984)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2954)
    at org.elasticsearch.index.engine.robin.RobinEngine.flush(RobinEngine.java:893)
    ... 5 more
[2015-03-01 07:23:27,078][WARN ][cluster.action.shard ] [Outlaw] [16494][4] sending failed shard for [16494][4], node[z-YGubBGRe2afo5G8MBPkQ], [P], s[STARTED], indexUUID [9MjfwirySmWIbqT8clWDwQ], reason [engine failure, message [OutOfMemoryError[Java heap space]]]
[2015-03-01 07:23:24,030][DEBUG][action.bulk ] [Outlaw] [16494][4] failed to execute bulk item (index) index {[16494][cs-us-east-1-logging-swc-rel][c22f6222-c146-4608-9dcc-c8846191c21a], source[{"version":"0.2","role":"es-data","from":"cs-us-east-1-logging-swc-rel","host":"ip-10-1-33-94-us-east-1-compute-internal","type":"log","time":1425092107803,"level":"system","text":" 27 disks \n 2 partitions \n 47584725 total reads\n 80364 merged reads\n 3159251537 read sectors\n 586297450 milli reading\n 634621170 writes\n 17059531 merged writes\n 49307983928 written sectors\n 2768108439 milli writing\n 0 inprogress IO\n 426354 milli spent IO\n","state":"info","service":"snapshot","process":"VMstat","uid":"c22f6222-c146-4608-9dcc-c8846191c21a"}]}
org.elasticsearch.index.engine.IndexFailedEngineException: [16494][4] Index failed for [cs-us-east-1-logging-swc-rel#c22f6222-c146-4608-9dcc-c8846191c21a]
    at org.elasticsearch.index.engine.robin.RobinEngine.index(RobinEngine.java:501)
    at org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:386)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:398)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:156)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:556)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:426)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
[2015-03-01 07:23:30,699][DEBUG][action.bulk ] [Outlaw] [16494][4], node[z-YGubBGRe2afo5G8MBPkQ], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.bulk.BulkShardRequest@7fca9100]
java.lang.NullPointerException
    at org.elasticsearch.action.bulk.TransportShardBulkAction.applyVersion(TransportShardBulkAction.java:640)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:178)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:556)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:426)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
[2015-03-01 07:23:30,719][WARN ][indices.cluster ] [Outlaw] [16494][4] master [[Tyrak][gJq80HQDTSKtZO56q81pIg][inet[/10.1.33.53:9300]]{data=false, rack=rack_tone, max_local_storage_nodes=1, master=true}] marked shard as started, but shard has not been created, mark shard as failed
[2015-03-01 07:23:30,728][WARN ][cluster.action.shard ] [Outlaw] [16494][4] sending failed shard for [16494][4], node[z-YGubBGRe2afo5G8MBPkQ], [P], s[STARTED], indexUUID [9MjfwirySmWIbqT8clWDwQ], reason [master [Tyrak][gJq80HQDTSKtZO56q81pIg][inet[/10.1.33.53:9300]]{data=false, rack=rack_tone, max_local_storage_nodes=1, master=true} marked shard as started, but shard has not been created, mark shard as failed]