Unable to write to Elasticsearch from Spark Java

Hi,

I have been following the documentation for writing data from Spark/Java into Elasticsearch: https://www.elastic.co/guide/en/elasticsearch/hadoop/current/spark.html#spark-sql
However, every time documents are written, they contain only metadata and none of the actual data from the RDD.
Is any other configuration required to write to Elasticsearch from Spark/Java?

I'm using ES v7.5.1 with Spark v2.2.1 and elasticsearch-spark-20_2.11 v7.5.1.

Code:

JavaSparkContext jsc = new JavaSparkContext(session.sparkContext());

// data to be saved
Map<String, ?> otp = ImmutableMap.of("iata", "OTP", "name", "Otopeni");
Map<String, ?> jfk = ImmutableMap.of("iata", "JFK", "name", "JFK NYC");

// create a pair RDD between the id and the docs
JavaPairRDD<?, ?> pairRdd = jsc.parallelizePairs(ImmutableList.of(
        new Tuple2<Object, Object>(1, otp),
        new Tuple2<Object, Object>(2, jfk)));

JavaEsSpark.saveToEsWithMeta(pairRdd, "spark-index");

Documents created:

{
  "_index" : "spark-index",
  "_type" : "_doc",
  "_id" : "2",
  "_score" : 1.0
},
{
  "_index" : "spark-index",
  "_type" : "_doc",
  "_id" : "1",
  "_score" : 1.0
}

Hello @angryninja,
I think the problem might be related to https://github.com/elastic/elasticsearch-hadoop/issues/913

Would it be possible to set the logging level for org.elasticsearch.hadoop.rest to TRACE and run the same test again?
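
For reference, a minimal sketch of how that logger could be raised to TRACE, assuming the Spark job picks up a log4j 1.x log4j.properties file (the default logging backend in Spark 2.x). Where that file lives and how it is loaded depend on the deployment, so treat this as an assumption rather than the exact setup used in this thread:

```properties
# Assumption: this line is added to the log4j.properties that the Spark driver and executors load.
# It raises only the elasticsearch-hadoop REST package to TRACE and leaves other loggers unchanged.
log4j.logger.org.elasticsearch.hadoop.rest=TRACE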

I looked at that issue, but it describes a different problem.
I am now able to write to ES using the following statement, but saveToEs (both methods) still doesn't work and I don't understand why. Both methods write only the metadata to ES and nothing else.

dataSetObj.write().format("org.elasticsearch.spark.sql").options(elasticSearchWriteOption()).mode("Append").save("spark-index");
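
The elasticSearchWriteOption() helper used above is not shown in this thread. A minimal sketch of what such a helper might look like follows, assuming a local single-node cluster. The class name EsWriteOptions and all of the values are hypothetical; the keys es.nodes, es.port and es.index.auto.create are standard elasticsearch-hadoop settings.

```java
import java.util.HashMap;
import java.util.Map;

class EsWriteOptions {

    // Hypothetical stand-in for the elasticSearchWriteOption() helper from the post.
    // The values are assumptions for a local test cluster, not the poster's real settings.
    static Map<String, String> elasticSearchWriteOption() {
        Map<String, String> options = new HashMap<>();
        options.put("es.nodes", "localhost");        // Elasticsearch host(s) to connect to
        options.put("es.port", "9200");              // REST port
        options.put("es.index.auto.create", "true"); // create the target index if it is missing
        return options;
    }

    public static void main(String[] args) {
        // Print the options that would be handed to DataFrameWriter.options(...)
        elasticSearchWriteOption().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The returned map is a plain java.util.Map<String, String>, which is the type the options(...) call in the statement above accepts.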
