Elasticsearch-hadoop not scaling


#1

Hi,
We are trying to get elasticsearch-hadoop to work in our modest Spark environment.

Running a job with 30 cores that writes 1M documents to a 4-node ES cluster fails. If the number of documents is small, e.g. under 10,000, it works. These servers are VMs.

We tried a variety of config options with the same results.

Any help is appreciated.

Here is the call (server names removed):

exportRDD.saveAsNewAPIHadoopFile(
    path='-',
    outputFormatClass="org.elasticsearch.hadoop.mr.EsOutputFormat",
    keyClass="org.apache.hadoop.io.NullWritable",
    valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
    conf={
        "es.resource": OAINDEX + "/" + OATYPE,
        "es.mapping.id": "id",
        "es.input.json": "true",
        "es.net.http.auth.user": "elastic",
        "es.batch.size.entries": "0",
        "es.write.operation": "index",
        "es.nodes.wan.only": "true",
        "es.net.http.auth.pass": "changeme",
        "es.nodes": ESNODES,
        "es.port": "9200"
    })
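
For readability, the same settings can be collected in a dictionary before the call. This is only a sketch of the configuration shown above: `build_es_conf` is a hypothetical helper name, and the index, type, and node values are placeholders, since the real ones were removed from this post.

```python
def build_es_conf(index, doc_type, nodes, user="elastic", password="changeme"):
    """Return the EsOutputFormat settings from the call above as a dict.

    All values are strings, matching the conf passed to
    saveAsNewAPIHadoopFile in the original call.
    """
    return {
        "es.resource": index + "/" + doc_type,
        "es.mapping.id": "id",
        "es.input.json": "true",
        "es.write.operation": "index",
        "es.batch.size.entries": "0",
        "es.nodes": nodes,
        "es.port": "9200",
        "es.nodes.wan.only": "true",
        "es.net.http.auth.user": user,
        "es.net.http.auth.pass": password,
    }


# Placeholder values (assumptions); substitute the real index, type and hosts.
es_conf = build_es_conf("myindex", "mytype", "server1,server2,server3,server4")
```

The resulting `es_conf` would then be passed as `conf=es_conf` in the call, which makes it easier to change one setting at a time while testing.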

Exception:

Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.saveAsNewAPIHadoopFile.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 11 in stage 106.0 failed 4 times, most recent failure: Lost task 11.3 in stage 106.0 (TID 6649, bdr-itwp-hdfs-5.dev.uspto.gov, executor 2): org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[server1,server2,server3,server4 (FQDN's removed)]]
at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:149)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:461)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:445)
at org.elasticsearch.hadoop.rest.RestClient.bulk(RestClient.java:186)
at org.elasticsearch.hadoop.rest.RestRepository.tryFlush(RestRepository.java:222)
at org.elasticsearch.hadoop.rest.RestRepository.flush(RestRepository.java:244)
at org.elasticsearch.hadoop.rest.RestRepository.doWriteToIndex(RestRepository.java:184)
at org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:161)
at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.write(EsOutputFormat.java:151)
at )

