Error saving Spark RDD using rdd.saveToEs

(Antonio Ye) #1

I have a simple Spark DataFrame that contains a column of JSON strings. When I run the following code:

import org.elasticsearch.spark._
val df = Seq("""{"name": "john", "age":44}""").toDF("json")
df.rdd.map(x => x.getAs[String]("json")).saveToEs("test-index-df/post")

I get the error below:

17/08/13 21:51:58 ERROR TaskContextImpl: Error in TaskCompletionListener Found unrecoverable error [] returned Bad Request(400) - failed to parse; Bailing out..
at org.elasticsearch.spark.rdd.EsRDDWriter$$anonfun$write$1.apply(EsRDDWriter.scala:60)
at org.elasticsearch.spark.rdd.EsRDDWriter$$anonfun$write$1.apply(EsRDDWriter.scala:60)
at org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:123)
at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:97)
at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:95)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)

Anyone have any idea what I am doing wrong?

(James Baiera) #2

I would make sure that the JSON you are saving does not contain any non-printing characters, or Unicode characters in standard JSON token positions (like special quote characters).
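As a rough illustration of that check (my own sketch, not from the thread): a plain-Scala helper, here called `hasSuspiciousChars` (a hypothetical name, not part of any library), that flags control characters and typographic "smart" quotes before the strings are handed to `saveToEs`:

```scala
// Sketch only: scan a JSON string for characters that commonly break
// Elasticsearch's JSON parser -- control characters, and typographic
// ("smart") quotes that look like ASCII quotes but are not.
// Note this is a blunt heuristic: it also flags legal JSON whitespace
// such as embedded newlines or tabs, so treat hits as candidates for
// inspection rather than definite errors.
object JsonSanity {
  private val smartQuotes = Set('\u201c', '\u201d', '\u2018', '\u2019')

  def hasSuspiciousChars(json: String): Boolean =
    json.exists(c => Character.isISOControl(c) || smartQuotes.contains(c))
}

object Main extends App {
  val good = """{"name": "john", "age": 44}"""
  val bad  = "{\u201cname\u201d: \"john\"}" // curly quotes instead of ASCII quotes

  println(JsonSanity.hasSuspiciousChars(good)) // false
  println(JsonSanity.hasSuspiciousChars(bad))  // true
}
```

In a Spark job you could run something like `rdd.filter(s => JsonSanity.hasSuspiciousChars(s))` first to surface the offending documents before indexing.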

(system) #3

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.