I want to connect Spark running on GCP Dataproc to an Elasticsearch cluster running on GCE.
I'm new to Dataproc and PySpark, so I'm stuck on installing the Elasticsearch-Hadoop connector.
This is the package I'm trying to install.
org.elasticsearch:elasticsearch-spark-20_2.11:5.3.1
Should I do that via initialization actions, cluster properties, or by SSHing into the machines and installing it manually?
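For example, one approach I'm considering is passing the package coordinate as a Spark cluster property at creation time, something like the following (the cluster name is just a placeholder, and I'm not sure the spark.jars.packages property is the right mechanism here):

```shell
# Sketch: create a Dataproc cluster with the ES-Hadoop Spark connector
# pulled in via spark.jars.packages (written to spark-defaults.conf).
# "my-cluster" is a placeholder name.
gcloud dataproc clusters create my-cluster \
    --properties=spark:spark.jars.packages=org.elasticsearch:elasticsearch-spark-20_2.11:5.3.1
```

Alternatively, I believe the same property can be set per job with `gcloud dataproc jobs submit pyspark --properties spark.jars.packages=...`, but I'm not sure which approach is preferred.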
Any help is appreciated!
Thanks!