Hi all,
I am trying to read data from Elasticsearch into Databricks (Spark), but I'm getting the following error:
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
which, according to your documentation, is symptomatic of a wrong driver version.
I'm running:
- Databricks runtime version 13.0 (includes Apache Spark 3.4.0, Scala 2.12)
- Elasticsearch version 8.5.2, so I installed org.elasticsearch:elasticsearch-spark-30_2.12:8.5.2 from Maven on the Databricks cluster
From a networking perspective, I'm able to reach the Elasticsearch host via telnet.
However, I'm not able to pull data from the Elasticsearch server using the following command:
df = (
    spark.read
    .format("org.elasticsearch.spark.sql")
    .option("spark.es.nodes", hostname)
    .option("spark.es.port", port)
    .option("spark.es.nodes.wan.only", "true")
    .option("spark.es.net.ssl", "true")
    .option("spark.es.net.http.auth.user", username)
    .option("spark.es.net.http.auth.pass", password)
    .load(f"{index}")
)
display(df)
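
One thing I have not ruled out is the option key prefix. As far as I understand, the connector documentation describes the `spark.` prefix for settings passed through SparkConf, while its DataFrame examples use `es.`-prefixed keys, so I am not sure the `spark.es.*` keys above are picked up when passed through `.option(...)`. Below is a minimal sketch of the same read with `es.`-prefixed keys. `es_read_options` and `read_index` are hypothetical helper names of mine, not part of the connector, and I am assuming the same `hostname`, `port`, `username`, `password` and `index` variables as above.

```python
def es_read_options(hostname, port, username, password):
    """Build connector options with `es.`-prefixed keys (hypothetical helper)."""
    return {
        "es.nodes": hostname,
        "es.port": str(port),            # passed as a string, like the other options
        "es.nodes.wan.only": "true",
        "es.net.ssl": "true",
        "es.net.http.auth.user": username,
        "es.net.http.auth.pass": password,
    }


def read_index(spark, index, options):
    """Load an index into a DataFrame.

    Assumes a live SparkSession and the elasticsearch-spark connector
    installed on the cluster.
    """
    return (
        spark.read
        .format("org.elasticsearch.spark.sql")
        .options(**options)
        .load(index)
    )
```

On the cluster this would be called as `df = read_index(spark, index, es_read_options(hostname, port, username, password))`, followed by `display(df)`. If the error persists with these keys, the prefix is presumably not the cause.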