Unable to write to Elasticsearch from Spark Java


I have been following the documentation for writing data from Spark / Java into Elasticsearch, described here: https://www.elastic.co/guide/en/elasticsearch/hadoop/current/spark.html#spark-sql
But every time, the documents that get written contain only metadata and none of the actual data from the RDD.
Is any other configuration required to write to Elasticsearch from Spark / Java?

I'm using ES v7.5.1 with Spark v2.2.1 and elasticsearch-spark-20_2.11 v7.5.1.

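Code:

    import java.util.Map;
    import com.google.common.collect.ImmutableList;
    import com.google.common.collect.ImmutableMap;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.elasticsearch.spark.rdd.api.java.JavaEsSpark;
    import scala.Tuple2;

    // session is the existing SparkSession
    JavaSparkContext jsc = new JavaSparkContext(session.sparkContext());

    // data to be saved
    Map<String, ?> otp = ImmutableMap.of("iata", "OTP", "name", "Otopeni");
    Map<String, ?> jfk = ImmutableMap.of("iata", "JFK", "name", "JFK NYC");

    // create a pair RDD between the id and the docs
    JavaPairRDD<?, ?> pairRdd = jsc.parallelizePairs(ImmutableList.of(
            new Tuple2<Object, Object>(1, otp),
            new Tuple2<Object, Object>(2, jfk)));

    // write the docs, using each tuple's key as the document id
    JavaEsSpark.saveToEsWithMeta(pairRdd, "spark-index");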

Documents created:

    "_index" : "spark-index",
    "_type" : "_doc",
    "_id" : "2",
    "_score" : 1.0

    "_index" : "spark-index",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 1.0

Hello @angryninja,
I think the problem might be related to https://github.com/elastic/elasticsearch-hadoop/issues/913

Would it be possible to set the log level for org.elasticsearch.hadoop.rest to TRACE and run the same test again?
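For reference, a minimal sketch of what that could look like. It assumes the job uses the log4j 1.x setup that Spark 2.x ships with, and that logging is configured through a log4j.properties file (for example under Spark's conf directory); both are assumptions about your environment, so adjust as needed:

```properties
# Assumption: Spark 2.x with its bundled log4j 1.x, configured via log4j.properties.
# Raise the REST layer of elasticsearch-hadoop to TRACE to get more detail
# on what the connector sends to Elasticsearch.
log4j.logger.org.elasticsearch.hadoop.rest=TRACE
```

With this in place, the driver and executor logs should show more detail about the connector's REST traffic, which makes it easier to check whether the document body is present when the bulk write goes out.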

I looked at the issue, but that is something else.
I am now able to write to ES using the following statement, but saveToEs (both methods) doesn't work and I don't understand why. Both methods write only the metadata to ES and nothing else.

