I was trying to index pdf in scala
i have converted text in base64 encoding
also created RDD and we want to indexing in elastic search using this RDD
Is their any other solution to solve this
Getting this error:
Caused by: org.elasticsearch.hadoop.serialization.EsHadoopSerializationException: Cannot handle type [class java.lang.Character], instance [D] using writer [org.elasticsearch.spark.serialization.ScalaValueWriter@5f0eac1c]
at org.elasticsearch.hadoop.serialization.builder.ContentBuilder.value(ContentBuilder.java:63)
at org.elasticsearch.hadoop.serialization.bulk.TemplatedBulk.doWriteObject(TemplatedBulk.java:71)
at org.elasticsearch.hadoop.serialization.bulk.TemplatedBulk.write(TemplatedBulk.java:58)
at org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:159)
at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:67)
at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:102)
at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:102)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
at org.apache.spark.scheduler.Task.run(Task.scala:86)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)