I am an Elasticsearch newbie trying to connect to Elasticsearch on GCP from Databricks on AWS.
I tried following the instructions provided by Databricks (unable to post link here).
However, I am now running into the following error:
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
I am able to ping the Elastic instance from Databricks.
I am running Databricks Runtime 13.3 LTS, which uses Scala 2.12 and Spark 3.4.1.
I have installed the elasticsearch-spark-30_2.12:8.11.4 library from the Maven repository.
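For context, this is the shape of the read I am attempting. The host, port, and index name below are placeholders, not my real values, and es.nodes.wan.only is the setting the error message refers to:

```scala
// Shape of the failing read; host, port, and index name are placeholders.
// 'es.nodes.wan.only' tells the connector to talk only to the address given
// in 'es.nodes' instead of discovering data-node IPs, which are not
// reachable from outside the Elastic Cloud network.
val df = spark.read
  .format("org.elasticsearch.spark.sql")
  .option("es.nodes", "my-deployment.es.us-central1.gcp.cloud.es.io")
  .option("es.port", "9243")
  .option("es.nodes.wan.only", "true")
  .load("my-index")

df.show()
```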
All Elastic Cloud clusters have security enabled, and it does not look like you are providing any security settings.
I have not used the ES Hadoop connector myself, but I have looked at a few related issues and they all have .option("es.net.ssl", "true"), .option("es.net.http.auth.user", "elastic") and .option("es.net.http.auth.pass", "password") set. Can you try adding these to see if they make any difference?
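Since I have not tried this against Elastic Cloud, treat it as a sketch rather than a known-good configuration, but putting those options together with the WAN setting would look something like this (host, user, and password are placeholders):

```scala
// Sketch only: host, port, user, and password are placeholders.
val df = spark.read
  .format("org.elasticsearch.spark.sql")
  .option("es.nodes", "my-deployment.es.us-central1.gcp.cloud.es.io")
  .option("es.port", "9243")
  .option("es.nodes.wan.only", "true")
  .option("es.net.ssl", "true")                 // Elastic Cloud endpoints are HTTPS
  .option("es.net.http.auth.user", "elastic")   // replace with your username
  .option("es.net.http.auth.pass", "password")  // replace with your password
  .load("my-index")
```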
Thank you for pointing that out. I was thinking about security as well, but I did not see any options pertaining to this while creating the instance. So, are the username and password literally "elastic" and "password", or do I need to set this user up in my Elastic instance?
The username and password will depend on your cluster. I would suggest creating a new user with the correct privileges and using that. The values I provided are only examples.
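You can create the user in Kibana under Stack Management > Security > Users, or through the security API. As a rough sketch of the API route (the host, admin credentials, new username, password, and role are all placeholders; the built-in viewer role is just one read-only option):

```scala
import java.net.{HttpURLConnection, URL}
import java.nio.charset.StandardCharsets
import java.util.Base64

// Sketch only: creates a "spark_reader" user via POST /_security/user/<name>.
// Host, admin credentials, new password, and role are placeholders.
val esUrl    = "https://my-deployment.es.us-central1.gcp.cloud.es.io:9243"
val adminUsr = "elastic"      // a user allowed to manage security
val adminPwd = "changeme"

val body =
  """{
    |  "password": "a-strong-password",
    |  "roles": ["viewer"],
    |  "full_name": "Spark reader"
    |}""".stripMargin

val conn = new URL(s"$esUrl/_security/user/spark_reader")
  .openConnection().asInstanceOf[HttpURLConnection]
conn.setRequestMethod("POST")
conn.setDoOutput(true)
conn.setRequestProperty("Content-Type", "application/json")
val token = Base64.getEncoder.encodeToString(
  s"$adminUsr:$adminPwd".getBytes(StandardCharsets.UTF_8))
conn.setRequestProperty("Authorization", s"Basic $token")

val out = conn.getOutputStream
out.write(body.getBytes(StandardCharsets.UTF_8))
out.close()
println(s"HTTP ${conn.getResponseCode}")  // expect 200 on success
```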