[Hadoop] OOME when streaming from ES into Spark

Hi,

We've noticed that one task seems to fail with an OOME while reading from ES
across 5 partitions and repartitioning into 200 partitions. Our job uses 50
cores and allocates 2 GB per executor.
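For context, our submit command looks roughly like this (standalone master; the class name, jar, and master URL are placeholders, but the memory and core counts match what's described above):

```shell
# Illustrative only -- app name, jar, and master URL are placeholders.
# 2 GB heap per executor, 50 cores across the cluster.
spark-submit \
  --class com.example.ContactsJob \
  --master spark://spark-master:7077 \
  --executor-memory 2g \
  --total-executor-cores 50 \
  contacts-job.jar
```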

ES Setup:
- Elasticsearch 1.4.1
- 4 nodes (r3.2xlarge)
- 5 shards
- Default configuration as documented here: http://www.elastic.co/guide/en/elasticsearch/hadoop/current/configuration.html
- ES-Hadoop 2.1.0 Beta3 against Spark 1.2.1
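For reference, the read side of our job is roughly the following sketch (host and index/type names are placeholders, and we haven't tuned es.scroll.size away from the default documented above):

```scala
import org.apache.spark.{SparkConf, SparkContext}
// Brings sc.esRDD(...) into scope (ES-Hadoop Spark integration).
import org.elasticsearch.spark._

// Placeholder node address; we otherwise run the documented defaults.
val conf = new SparkConf()
  .setAppName("es-read")
  .set("es.nodes", "es-host:9200")
  // es.scroll.size bounds how many hits each scroll request pulls,
  // so it also bounds how much one batch holds on the executor heap.
  .set("es.scroll.size", "50")

val sc = new SparkContext(conf)

// 5 shards give us 5 input partitions; we then fan out to 200.
val docs = sc.esRDD("index/type").repartition(200)
```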

Below is the stack trace we typically see: one task fails, and the retry
then succeeds.

Is there something in the RDD that would cause it to not offload data from
the heap and thus cause an OOME?

03/19/2015 16:10:47,304 [task-result-getter-2] WARN org.apache.spark.scheduler.TaskSetManager - Lost task 0.0 in stage 0.0 (TID 7, prod-contacts-cass-1b-2): java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.LinkedHashMap.createEntry(LinkedHashMap.java:442)
    at java.util.HashMap.addEntry(HashMap.java:884)
    at java.util.LinkedHashMap.addEntry(LinkedHashMap.java:427)
    at java.util.HashMap.put(HashMap.java:505)
    at org.elasticsearch.hadoop.serialization.builder.JdkValueReader.addToMap(JdkValueReader.java:87)
    at org.elasticsearch.hadoop.serialization.ScrollReader.map(ScrollReader.java:277)
    at org.elasticsearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:200)
    at org.elasticsearch.hadoop.serialization.ScrollReader.map(ScrollReader.java:277)
    at org.elasticsearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:200)
    at org.elasticsearch.hadoop.serialization.ScrollReader.list(ScrollReader.java:241)
    at org.elasticsearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:203)
    at org.elasticsearch.hadoop.serialization.ScrollReader.map(ScrollReader.java:277)
    at org.elasticsearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:200)
    at org.elasticsearch.hadoop.serialization.ScrollReader.readHit(ScrollReader.java:156)
    at org.elasticsearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:102)
    at org.elasticsearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:81)
    at org.elasticsearch.hadoop.rest.RestRepository.scroll(RestRepository.java:314)
    at org.elasticsearch.hadoop.rest.ScrollQuery.hasNext(ScrollQuery.java:76)
    at org.elasticsearch.spark.rdd.AbstractEsRDDIterator.hasNext(AbstractEsRDDIterator.scala:46)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.util.collection.ExternalSorter.spillToPartitionFiles(ExternalSorter.scala:365)
    at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:211)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:56)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
