Databricks Spark SQL and Elasticsearch

Is there any documentation on integrating Databricks Spark / Spark SQL with Elasticsearch?

No, this is not something our documentation covers. You might need to ask the Databricks team.

I can see the documentation for the Hadoop integration (Elasticsearch for Hadoop | Elastic).

So I was checking whether there is similar documentation for the Databricks/Spark integration as well.

Sure. Thanks.

I assume you have seen the "Apache Spark support" page in the Elasticsearch for Apache Hadoop [8.6] documentation? es-spark is just a library. You just need to figure out how to get it onto the classpath of whatever you are using to interact with Spark (whether that's spark-shell, your own application, or a notebook). For example, to try it out in spark-shell (regardless of whether it's Cloudera, Databricks, or plain Apache Spark) you can do something like this:

/home/keith/spark-3.2.1-bin-hadoop3.2/bin/spark-shell --master yarn --deploy-mode client --jars /home/keith/elasticsearch-spark-30_2.12-8.6.0.jar

Configuration for talking to your Elasticsearch cluster is then done in your Spark code (see that first link above). If you can describe how you're trying to use it, we might be able to give you more help. But as Mark said, we don't document all of the different Spark distributions or ways to use Spark.
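
To make the "configuration in your Spark code" part a bit more concrete, here is a minimal PySpark sketch, assuming the es-spark jar is already on the classpath and that `spark` is an existing SparkSession (as it is in a notebook or in the pyspark shell). The helper names `es_options`, `read_index` and `write_index` are made up for illustration, and the host, index and credentials are placeholders. The `es.*` keys are settings from the Elasticsearch for Apache Hadoop configuration; check the docs for your version before relying on them.

```python
def es_options(nodes, port=9200, wan_only=False, user=None, password=None):
    """Build es-hadoop connector settings as string-valued Spark options.

    Hypothetical helper: only the es.* keys come from the connector.
    """
    opts = {
        "es.nodes": nodes,            # comma-separated Elasticsearch hosts
        "es.port": str(port),         # HTTP port of the cluster
        # Often needed when the cluster sits behind a proxy or in a cloud
        # network where the individual data nodes are not reachable.
        "es.nodes.wan.only": str(wan_only).lower(),
    }
    if user is not None:
        opts["es.net.http.auth.user"] = user
        opts["es.net.http.auth.pass"] = password or ""
    return opts


def read_index(spark, index, opts):
    """Load an Elasticsearch index as a DataFrame via the Spark SQL data source."""
    return (
        spark.read.format("org.elasticsearch.spark.sql")
        .options(**opts)
        .load(index)
    )


def write_index(df, index, opts):
    """Append the rows of a DataFrame to an Elasticsearch index."""
    (
        df.write.format("org.elasticsearch.spark.sql")
        .options(**opts)
        .mode("append")
        .save(index)
    )
```

In a notebook this would be used roughly as `df = read_index(spark, "my-index", es_options("localhost"))`, after which `df.createOrReplaceTempView("my_index")` makes the data queryable from Spark SQL. How the jar gets attached (a `--jars` flag, a cluster library, etc.) depends on the Spark distribution and is not covered by this sketch.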

Thanks Keith. This helps.
