Problem with retrieving data from Elasticsearch by Spark

(Yasmeen Chakrayapeta) #1

Hi Team,

I'm quite new to ES and ES-Hadoop. The code I'm using to pull data out of ES is below:

es_read_conf = {
    "es.nodes": "",
    "es.port": "80",
    "es.resource": "temprollover/rollover",
    "es.input.json": "yes"
}

es_rdd = sc.newAPIHadoopRDD(
    inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
    keyClass="org.apache.hadoop.io.NullWritable",
    valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
    conf=es_read_conf
)

I am trying to retrieve data from Elasticsearch and convert it into an RDD, but I get the error below. Can you please help?
Traceback (most recent call last):
File "/home/hadoop/", line 21, in
File "/usr/lib/spark/python/lib/", line 751, in newAPIHadoopRDD
File "/usr/lib/spark/python/lib/", line 1257, in call
File "/usr/lib/spark/python/lib/", line 63, in deco
File "/usr/lib/spark/python/lib/", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: No data nodes with HTTP-enabled available

(James Baiera) #2

The error message here points to an issue with your Elasticsearch cluster: ES-Hadoop cannot find any nodes that are data nodes and also support communication over HTTP. I would double-check your deployment to ensure that such nodes exist and are reachable from ES-Hadoop.
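One way to sanity-check this is to list the cluster's nodes and confirm at least one is a data node with an HTTP address. The sketch below only illustrates the filtering logic; the node entries are made up, and in practice they would come from a request like GET /_cat/nodes?format=json&h=name,node.role,http_address against your cluster:

```python
# Hypothetical node listing, shaped like the JSON returned by the
# _cat/nodes API (name, node.role, http_address columns).
sample_nodes = [
    {"name": "node-1", "node.role": "dim", "http_address": "10.0.0.1:9200"},
    {"name": "node-2", "node.role": "m",   "http_address": "10.0.0.2:9200"},
]

def http_enabled_data_nodes(nodes):
    """Keep nodes that hold data ('d' in node.role) and expose an HTTP address."""
    return [n for n in nodes
            if "d" in n.get("node.role", "") and n.get("http_address")]

print([n["name"] for n in http_enabled_data_nodes(sample_nodes)])
# -> ['node-1']
```

If this kind of check turns up no data nodes reachable over HTTP, ES-Hadoop will fail with exactly the "No data nodes with HTTP-enabled available" error above.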

(system) closed #3

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.