I am trying to send data stored in a PySpark DataFrame directly into Elasticsearch. I found some code snippets online on how to do this, and they basically look like this:
# df - DataFrame
df.write.format("org.elasticsearch.spark.sql") \
    .option("es.nodes", "ip_address") \
    .option("es.resources", "test") \
    .save()
but whatever value I pass to the option results in an IllegalArgumentException. I am running Scala 2.11.0 and Spark 2.3.4 on Google Cloud. My Elasticsearch instance (version 7.4.1) runs on a separate machine.
I downloaded the jar elasticsearch-spark-20_2.11-7.4.1.jar and placed it in Spark's jars directory (this approach previously worked with MongoDB and Google Cloud Storage).
How do I properly write a PySpark DataFrame to Elasticsearch?
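For reference, based on the elasticsearch-hadoop documentation, I would expect something like the sketch below to work. Note that the connector's option key is `es.resource` (singular), not `es.resources`; the keys `es.port` and `es.nodes.wan.only` are my additions from the docs, and `ip_address` and the index name `test` are placeholders. I have not been able to verify this against a live cluster:

```python
from pyspark.sql import SparkSession

# Sketch only: assumes elasticsearch-spark-20_2.11-7.4.1.jar is on the
# driver/executor classpath and "ip_address" is a reachable
# Elasticsearch 7.x node (placeholder).
spark = SparkSession.builder.appName("es-write-sketch").getOrCreate()
df = spark.createDataFrame([(1, "alice"), (2, "bob")], ["id", "name"])

df.write.format("org.elasticsearch.spark.sql") \
    .option("es.nodes", "ip_address") \
    .option("es.port", "9200") \
    .option("es.nodes.wan.only", "true") \
    .option("es.resource", "test") \
    .mode("append") \
    .save()
```

Since Elasticsearch 7.x removed mapping types, `es.resource` should name just the index (`"test"`), not an `"index/type"` pair as older examples show.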