Hi there,
I'm trying to push data from Databricks/PySpark to Elasticsearch, following these instructions: ElasticSearch | Databricks on AWS
Unfortunately I'm getting this error:
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
Running a curl command from Databricks works, and I can see the pushed data in Kibana, so the connection works in general. But how do I translate the curl command into the correct settings in Python? I already tried to find the right options in Configuration | Elasticsearch for Apache Hadoop [8.11] | Elastic, but without success so far.
This is the curl command:
curl --user username:passwd -X PUT -H "Content-Type: application/json" -d '{"name":"John Doe"}' http://my.url.com/elasticsearch/test_databricks/_doc/1
Python code I tried:
df.write \
    .format("org.elasticsearch.spark.sql") \
    .option("es.nodes", "my.url.com/elasticsearch/") \
    .option("es.net.ssl", "false") \
    .option("es.nodes.wan.only", "true") \
    .option("es.net.http.auth.user", "username") \
    .option("es.net.http.auth.pass", "passwd") \
    .mode("overwrite") \
    .save("index/test_databricks")
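To make the comparison concrete, here is how I think the curl URL would have to decompose into separate connector settings: if I read the configuration docs correctly, es.nodes should hold only the hostname, the reverse-proxy path ("/elasticsearch") would go into es.nodes.path.prefix, and the index goes into save(). A minimal sketch of that mapping in plain Python (no Spark needed; the helper name curl_url_to_es_options is my own, and the exact settings are my assumption, not verified):

```python
from urllib.parse import urlparse

def curl_url_to_es_options(url):
    """Split a full document URL (as used with curl) into the separate
    settings the elasticsearch-spark connector seems to expect."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    # The last two path segments are <type>/<id>; the segment before
    # them is the index. Anything earlier is a proxy path prefix.
    index = parts[-3]
    prefix_parts = parts[:-3]
    opts = {
        # hostname only -- no path component here
        "es.nodes": parsed.hostname,
        "es.port": str(parsed.port or (443 if parsed.scheme == "https" else 80)),
        # required when nodes sit behind a proxy / cloud endpoint
        "es.nodes.wan.only": "true",
    }
    if prefix_parts:
        opts["es.nodes.path.prefix"] = "/" + "/".join(prefix_parts)
    return opts, index

opts, index = curl_url_to_es_options(
    "http://my.url.com/elasticsearch/test_databricks/_doc/1")
# opts -> es.nodes="my.url.com", es.port="80",
#         es.nodes.path.prefix="/elasticsearch"; index="test_databricks"
```

The resulting opts dict could then be passed via .options(**opts) and the index via .save(index), but I haven't confirmed this fixes the version-detection error.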
Thank you in advance