PySpark: from curl to correct settings

Hi there,

I'm trying to push data from Databricks/PySpark to Elasticsearch following these instructions: ElasticSearch | Databricks on AWS

Unfortunately I'm getting this error:

org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'

Running a curl command from Databricks works, and I can see the pushed data in Kibana, so the connection works in general. But how do I get from the curl command to the correct settings in Python? I have already looked for the right options in Configuration | Elasticsearch for Apache Hadoop [8.11] | Elastic, but without success so far.

This is the curl command:

curl --user username:passwd -X PUT -H "Content-Type: application/json" -d '{"name":"John Doe"}' http://my.url.com/elasticsearch/test_databricks/_doc/1

Python code I tried:

(
  df.write
  .format("org.elasticsearch.spark.sql")
  .option("es.nodes", "my.url.com/elasticsearch/")
  .option("es.net.ssl", "false")
  .option("es.nodes.wan.only", "true")
  .option("es.net.http.auth.user", "username")
  .option("es.net.http.auth.pass", "passwd")
  .mode("overwrite")
  .save("index/test_databricks")
)

Thank you in advance

You can get that error for a variety of reasons. Look for a "Caused by" entry in the stack trace that might give you more information. I believe it will be in your Spark driver log.

Using the option es.nodes.path.prefix fixed the issue:

(
  df.write
  .format("org.elasticsearch.spark.sql")
  .option("es.nodes", "my.url.com")
  .option("es.nodes.path.prefix", "elasticsearch")
  .option("es.net.ssl", "false")
  .option("es.nodes.wan.only", "true")
  .option("es.net.http.auth.user", "username")
  .option("es.net.http.auth.pass", "passwd")
  .mode("overwrite")
  .save("/test_databricks")
)
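
For anyone else translating a working curl URL into connector settings, here is a rough sketch of the mapping used above: host goes to es.nodes, the path in front of the index goes to es.nodes.path.prefix, and the scheme decides es.net.ssl. The helper name es_options_from_url is hypothetical and not part of any library. It sets es.port only when the URL names a port explicitly, which mirrors the settings above; as far as I know the connector otherwise falls back to its default of 9200, so you may need to set es.port yourself if your proxy listens elsewhere.

```python
from urllib.parse import urlsplit


def es_options_from_url(base_url, user=None, password=None):
    """Hypothetical helper: split an Elasticsearch base URL (the part of the
    curl URL before the index name) into ES-Hadoop connector options."""
    parts = urlsplit(base_url)
    options = {
        # Host only; the path must not be appended to es.nodes.
        "es.nodes": parts.hostname,
        "es.net.ssl": "true" if parts.scheme == "https" else "false",
        "es.nodes.wan.only": "true",
    }
    # Only set es.port when the URL states one (assumption, see note above).
    if parts.port is not None:
        options["es.port"] = str(parts.port)
    # Anything after the host is the proxy path prefix, written here in the
    # same form as in the working settings above (no leading slash).
    prefix = parts.path.strip("/")
    if prefix:
        options["es.nodes.path.prefix"] = prefix
    if user is not None:
        options["es.net.http.auth.user"] = user
    if password is not None:
        options["es.net.http.auth.pass"] = password
    return options


opts = es_options_from_url("http://my.url.com/elasticsearch/", "username", "passwd")

# With a DataFrame named df available (not defined in this sketch):
# df.write.format("org.elasticsearch.spark.sql").options(**opts) \
#     .mode("overwrite").save("test_databricks")
```

Passing the dictionary through options(**opts) keeps the URL as the single source of truth, so the curl command and the Spark job cannot drift apart.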
