Cannot detect ES version when executing a Spark job in cluster mode against AWS-hosted Elasticsearch


#1

While executing a Spark job in cluster mode to load data from JSON into an Elasticsearch cluster hosted on AWS, using the configuration settings below, I get the following error:

Error: "py4j.protocol.Py4JJavaError: An error occurred while calling o2519.save.
: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'"

Configuration settings:

    # df is the DataFrame holding the JSON data
    (df.write.format("org.elasticsearch.spark.sql")
        .option("es.nodes.wan.only", "true")
        .option("es.port", "443")
        .option("es.net.ssl", "true")
        .option("es.net.ssl.cert.allow.self.signed", "true")
        .option("es.nodes", "ssssss.us-east-2.es.amazonaws.com")
        .option("es.mapping.id", "doc_id")
        .mode("append")
        .save("index/type"))

Although I set "es.nodes.wan.only" to "true", it seems to be ignored at execution time. Any help would be appreciated.
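One thing worth trying in cluster mode is setting the connector options on the Spark configuration (or via --conf on spark-submit) rather than only on the writer, so the driver and all executors see the same values. elasticsearch-hadoop picks up any "es.*" property from the Spark configuration when it is prefixed with "spark.". A minimal sketch, assuming PySpark, the redacted endpoint from the post, and a hypothetical input path:

    from pyspark.sql import SparkSession

    # Set the connector options cluster-wide so they apply in any deploy mode.
    spark = (SparkSession.builder
        .appName("json-to-es")
        .config("spark.es.nodes", "ssssss.us-east-2.es.amazonaws.com")
        .config("spark.es.port", "443")
        .config("spark.es.nodes.wan.only", "true")
        .config("spark.es.net.ssl", "true")
        .getOrCreate())

    df = spark.read.json("s3://your-bucket/input/")  # hypothetical input path

    (df.write.format("org.elasticsearch.spark.sql")
        .option("es.mapping.id", "doc_id")
        .mode("append")
        .save("index/type"))

The same values can be passed on the command line, e.g. --conf spark.es.nodes.wan.only=true, which avoids hard-coding them in the job.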


(Mark Walkom) #2

This might be a limitation of the AWS service, as they don't offer all of the Elasticsearch APIs or functionality.

You may also need to ask them.
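Before raising it with AWS, it may also be worth confirming from the network where the executors run (not just the driver or your laptop) that the cluster's root endpoint is reachable, since GET / is the call the connector uses to detect the version. A minimal sketch using only the Python standard library, reusing the redacted hostname from the original post:

    import json
    import urllib.request

    # GET / returns the cluster name and version info when the endpoint is reachable.
    url = "https://ssssss.us-east-2.es.amazonaws.com:443/"
    with urllib.request.urlopen(url, timeout=10) as resp:
        info = json.load(resp)

    print(info.get("version", {}).get("number", "no version field in response"))

If this fails from the executors' network but succeeds elsewhere, the problem is connectivity (VPC / security groups / access policy) rather than the connector settings.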


(system) closed #3

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.