Problems connecting to ES from Databricks using the Spark connector

Hi all,

I am trying to connect to ES from our Databricks cluster.

I have successfully installed elasticsearch-spark-30_2.12:8.4.3 on the cluster and confirmed that the Elasticsearch version is 8.3.4.

I am able to query data from ES with plain curl requests, e.g. the following

curl --user 'username:password' -X GET 'https://my-elasitic-url.elastic-cloud.com/index/_search' -H 'Accept: application/json' -H 'Content-type: application/json' -d '{"query": {"query_string": {"query": "my_query"}}}'

works.

However, running the following code

df = (spark.read
      .format("org.elasticsearch.spark.sql")
      .option("es.nodes", "https://my-elasitic-url.elastic-cloud.com")
      .option("es.net.ssl", "false")
      .option("es.nodes.wan.only", "true")
      .option("es.net.http.auth.user", username)
      .option("es.net.http.auth.pass", password)
      .load(headers["index"])
     )

results in this error:

org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'

Any suggestions?

Hi @lhfo. What is the full stack trace? There ought to be a "Caused by" that will tell us a little more. Also, is your Elasticsearch cluster accessible from the node where your Spark driver is running?

I solved this issue by setting

 .option("es.port","443")
 .option( "es.net.ssl", "true")

@Keith_Massey, thanks for the reply!


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.