I am trying to send data stored in a PySpark DataFrame directly into Elasticsearch. I found some code snippets online on how to do this, and they basically look like this:
# df - DataFrame
df.write.format("org.elasticsearch.spark.sql") \
    .option("es.nodes", "ip_address") \
    .option("es.resources", "test") \
    .save()
but whatever value I pass to the option results in an IllegalArgumentException. I am running Scala 2.11.0 and Spark 2.3.4 on Google Cloud. My Elasticsearch instance (version 7.4.1) runs on a separate machine.
I downloaded the jar elasticsearch-spark-20_2.11-7.4.1.jar and placed it in Spark's jars directory (this approach previously worked with MongoDB and Google Cloud Storage).
How do I properly write a PySpark DataFrame to Elasticsearch?
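For reference, based on the elasticsearch-hadoop documentation, I would expect something like the sketch below to work. Note that the connector's option key is `es.resource` (singular), not `es.resources`; the keys `es.port` and `es.nodes.wan.only` are my additions from the docs, and `ip_address` and the index name `test` are placeholders. I have not been able to verify this against a live cluster:

```python
from pyspark.sql import SparkSession

# Sketch only: assumes elasticsearch-spark-20_2.11-7.4.1.jar is on the
# driver/executor classpath and "ip_address" is a reachable
# Elasticsearch 7.x node (placeholder).
spark = SparkSession.builder.appName("es-write-sketch").getOrCreate()
df = spark.createDataFrame([(1, "alice"), (2, "bob")], ["id", "name"])

df.write.format("org.elasticsearch.spark.sql") \
    .option("es.nodes", "ip_address") \
    .option("es.port", "9200") \
    .option("es.nodes.wan.only", "true") \
    .option("es.resource", "test") \
    .mode("append") \
    .save()
```

Since Elasticsearch 7.x removed mapping types, `es.resource` should name just the index (`"test"`), not an `"index/type"` pair as older examples show.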