Problem with retrieving data from Elasticsearch by Spark

Yasmeenc · February 1, 2019, 9:50pm

Hi Team,

I'm quite new to ES and ES-Hadoop. This is the code I use to pull data out of ES:

es_read_conf = {
    "es.nodes": "",
    "es.port": "80",
    "es.resource": "temprollover/rollover",
    "es.input.json": "yes"
}

es_rdd = sc.newAPIHadoopRDD(
    inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
    keyClass="org.apache.hadoop.io.NullWritable",
    valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
    conf=es_read_conf)

I am trying to retrieve data from Elasticsearch and convert it into an RDD, but I get the error below. Can you please help?

Traceback (most recent call last):
  File "/home/hadoop/rdd-spark.py", line 21, in <module>
    conf=es_read_conf)
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/context.py", line 751, in newAPIHadoopRDD
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: No data nodes with HTTP-enabled available
    at org.elasticsearch.hadoop.rest.InitializationUtils.filterNonDataNodesIfNeeded(InitializationUtils.java:159)
    at org.elasticsearch.hadoop.rest.RestService.findPartitions(RestService.java:223)
    at org.elasticsearch.hadoop.mr.EsInputFormat.getSplits(EsInputFormat.java:412)

james.baiera · February 11, 2019, 9:42pm

The error message here points to an issue with your Elasticsearch cluster: ES-Hadoop cannot find any nodes that are data nodes and that also accept connections over HTTP. I would double-check your deployment to make sure such nodes exist and are reachable from the machines running ES-Hadoop.
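
One way to run that check from the Spark side is to ask the cluster which nodes it reports and look for ones that carry the data role and publish an HTTP address. The snippet below is only a rough sketch, not part of ES-Hadoop: the helper names `http_data_nodes` and `fetch_nodes_info` and the base URL are made up for illustration, and the sample dictionary is hand-written to resemble a `_nodes/http` response rather than copied from a real cluster.

```python
import json
from urllib.request import urlopen


def http_data_nodes(nodes_info):
    """Return the names of nodes that have the data role and publish an HTTP address."""
    found = []
    for node in nodes_info.get("nodes", {}).values():
        has_data_role = "data" in node.get("roles", [])
        has_http = bool(node.get("http", {}).get("publish_address"))
        if has_data_role and has_http:
            found.append(node.get("name"))
    return found


def fetch_nodes_info(base_url):
    """Fetch node info over HTTP. base_url is a placeholder, e.g. "http://my-es-host:80"."""
    with urlopen(base_url.rstrip("/") + "/_nodes/http", timeout=10) as resp:
        return json.load(resp)


# Hand-written sample shaped like a _nodes/http response (illustrative only).
sample = {
    "nodes": {
        "id-1": {
            "name": "node-a",
            "roles": ["master", "data"],
            "http": {"publish_address": "10.0.0.1:9200"},
        },
        "id-2": {
            "name": "node-b",
            "roles": ["master"],
            "http": {"publish_address": "10.0.0.2:9200"},
        },
        "id-3": {"name": "node-c", "roles": ["data"]},
    }
}

print(http_data_nodes(sample))
```

Against a live cluster you would call `http_data_nodes(fetch_nodes_info(...))` from a machine that runs the Spark job. If the list comes back empty, or the published addresses are not reachable from that machine (for example when the cluster sits behind a proxy or a hosted endpoint), that would be consistent with the error above. In that kind of setup the ES-Hadoop setting `es.nodes.wan.only` may also be worth looking at, though whether it applies depends on your deployment.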
