Connector for Elasticsearch 8.6.2 and Databricks Spark 3.4.0

I am trying to connect to Elasticsearch from Spark on Databricks, but I keep getting this error: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'.

The connector I installed is org.elasticsearch:elasticsearch-spark-30_2.12:7.17.13.
The Spark version on Databricks is 3.4.0.

The Elasticsearch version is 8.6.2, and the Scala version on Databricks is 2.12.15.

Any suggestions would be appreciated.

Thanks

Hi @luckymishra. You'll want your es-spark version to be compatible with Elasticsearch 8.x. So you probably want org.elasticsearch:elasticsearch-spark-30_2.12:8.6.2.
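As a side note, the connector coordinate itself encodes the compatibility story. A small sketch (the coordinate below is just derived from the versions mentioned in this thread):

```python
# The artifact name encodes the Spark line ("30" = Spark 3.x) and the Scala
# version (2.12); the artifact *version* should track the Elasticsearch
# server version — 8.6.2 in this thread, not 7.17.x.
elasticsearch_version = "8.6.2"
connector = f"org.elasticsearch:elasticsearch-spark-30_2.12:{elasticsearch_version}"
print(connector)  # org.elasticsearch:elasticsearch-spark-30_2.12:8.6.2
```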

Thank you. I was able to add the library, but I am still getting this error: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'.

So it's working with es.nodes.wan.only set to true, and the list of nodes given? And you're wanting to run with that set to false instead? Are all of your spark nodes able to access all of your elasticsearch nodes?

I want to send you the code, but the system is not letting me post it.
No, setting es.nodes.wan.only to true/false doesn't make any impact.

Just to let you know, it works perfectly if I create the connection like es_client = connections.create_connection(cloud_id='xxxxxxx', api_key='xxxxxx'), but this can't be used with Spark.

> I want to send you the code, but the system is not letting me post it.

Have you tried just pasting it in here in a formatted block? Maybe it's too long?

> Just to let you know, it works perfectly if I create the connection like es_client = connections.create_connection(cloud_id='xxxxxxx', api_key='xxxxxx'), but this can't be used with Spark.

I'm not familiar with cloud_id. What kind of authentication are you using? I assume you've seen ES-Hadoop and Security | Elasticsearch Guide [8.11] | Elastic?
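For what it's worth, a Cloud ID is just a deployment label plus a base64-encoded `host$es_uuid$kibana_uuid` triple, so the HTTPS endpoint that es-hadoop needs for es.nodes can be derived from it. A sketch with a made-up Cloud ID (the real port varies by deployment, typically 443 or 9243):

```python
import base64

# Made-up Cloud ID for illustration; the format is
# "deployment-name:base64(host$es_uuid$kibana_uuid)".
cloud_id = "my-deployment:" + base64.b64encode(
    b"us-east-1.aws.found.io$abc123$def456"
).decode()

_label, payload = cloud_id.split(":", 1)
host, es_uuid, _kibana_uuid = base64.b64decode(payload).decode().split("$")

# The Elasticsearch endpoint is the ES UUID prefixed onto the host.
es_node = f"https://{es_uuid}.{host}"
print(es_node)  # https://abc123.us-east-1.aws.found.io
```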

Hi Keith, apologies for not coming back sooner. Just to let you know, I am still getting the error, which suggests the connector version is not correct. I am using the standard template to connect, but the only difference is that Elastic is hosted in the cloud, so I have an address starting with https://. Can you suggest something?
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'

Have you set es.nodes.wan.only to true? When using es-hadoop/spark through cloud, Elasticsearch appears as a single node to es-hadoop. You might want to take a look at Writing PySpark dataframe to Elastic Cloud (Cannot detect ES version) for a discussion of some of the complications that come with that.

Yes, I set it to true. OK, I will go through the link.
es_conf = {"es.nodes": "xxxx", "es.port": "9243", "es.nodes.wan.only": "true"}  # string values, not int/bool
es_conf["es.cloud.id"] = "xxxx"
es_conf["es.api.key"] = "xxx"
es_conf["es.net.ssl"] = "true"

df = spark.read.format("org.elasticsearch.spark.sql").options(**es_conf).load("abc")
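As an aside, I don't believe es.cloud.id or es.api.key are recognized es-hadoop settings, which would explain why they have no effect. A common workaround (an assumption worth verifying against the es-hadoop security docs) is to pass the API key as a standard Authorization header via the es.net.http.header.* settings:

```python
import base64

# Hypothetical API key id/secret for illustration. An id:secret pair is
# base64-encoded into the standard "ApiKey" Authorization scheme; if the
# Cloud console already gives you a single encoded key, use it directly.
api_key_id, api_key_secret = "my_key_id", "my_key_secret"
token = base64.b64encode(f"{api_key_id}:{api_key_secret}".encode()).decode()

es_conf = {
    "es.nodes": "https://xxxx",  # placeholder cloud endpoint
    "es.port": "9243",
    "es.nodes.wan.only": "true",
    "es.net.ssl": "true",
    # Custom headers go through the es.net.http.header.* prefix
    "es.net.http.header.Authorization": f"ApiKey {token}",
}
```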
