Unable to write to Elasticsearch from Spark Java

angryninja · May 5, 2020, 10:18pm

Hi,

I have been following the documentation for writing data from Spark (Java) to Elasticsearch: https://www.elastic.co/guide/en/elasticsearch/hadoop/current/spark.html#spark-sql
However, every document that gets written contains only metadata, not the actual data from the RDD.
Is any other configuration required to write to Elasticsearch from Spark/Java?

I'm using Elasticsearch 7.5.1 with Spark 2.2.1 and elasticsearch-spark-20_2.11 7.5.1.

Code:

JavaSparkContext jsc = new JavaSparkContext(session.sparkContext());

// data to be saved
Map<String, ?> otp = ImmutableMap.of("iata", "OTP", "name", "Otopeni");
Map<String, ?> jfk = ImmutableMap.of("iata", "JFK", "name", "JFK NYC");

// create a pair RDD between the id and the docs
JavaPairRDD<?, ?> pairRdd = jsc.parallelizePairs(ImmutableList.of(
        new Tuple2<Object, Object>(1, otp),
        new Tuple2<Object, Object>(2, jfk)));

JavaEsSpark.saveToEsWithMeta(pairRdd, "spark-index");

Documents created:

{
  "_index" : "spark-index",
  "_type" : "_doc",
  "_id" : "2",
  "_score" : 1.0
},
{
  "_index" : "spark-index",
  "_type" : "_doc",
  "_id" : "1",
  "_score" : 1.0
}

Luca_Belluccini · May 6, 2020, 12:43am

Hello @angryninja,
I think the problem might be related to https://github.com/elastic/elasticsearch-hadoop/issues/913

Could you set the log level of org.elasticsearch.hadoop.rest to TRACE and run the same test again?
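
Spark 2.x logs through log4j 1.x, so one way to raise that log level is a line in the log4j.properties file that the driver and executors load. A minimal fragment, assuming the job uses Spark's standard conf/log4j.properties (the file location is an assumption, not something stated in this thread):

```properties
# Trace the REST traffic between es-hadoop and Elasticsearch (very verbose)
log4j.logger.org.elasticsearch.hadoop.rest=TRACE
```

At this level the output should include the bulk requests being sent, which makes it possible to see whether the document bodies are empty before they reach Elasticsearch.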

angryninja · May 6, 2020, 8:53pm

I looked at the issue, but it describes a different problem.
I can now write to ES using the following statement, but saveToEs (both methods) still doesn't work and I don't understand why. Both methods write only the metadata to ES and nothing else.

dataSetObj.write().format("org.elasticsearch.spark.sql").options(elasticSearchWriteOption()).mode("Append").save("spark-index");
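
The statement above calls elasticSearchWriteOption(), which is not shown in the thread. As a rough sketch only, such a helper might return a map of es-hadoop settings like the one below. The class name EsWriteOptions, the host and port values, and the choice of iata as the document id are all assumptions for illustration; es.nodes, es.port and es.mapping.id are setting names from the es-hadoop configuration reference.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: the thread never shows the real elasticSearchWriteOption().
class EsWriteOptions {

    // Builds the option map handed to DataFrameWriter.options(...).
    static Map<String, String> elasticSearchWriteOption() {
        Map<String, String> options = new HashMap<>();
        options.put("es.nodes", "localhost");  // assumed Elasticsearch host
        options.put("es.port", "9200");        // assumed REST port
        options.put("es.mapping.id", "iata");  // assumed: use the iata field as _id
        return options;
    }

    public static void main(String[] args) {
        // Print the options so they can be checked before wiring them into the writer.
        elasticSearchWriteOption().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

The map uses String keys and values because DataFrameWriter.options accepts a java.util.Map<String, String>.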
