Exception in thread "main" org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: No class name given
at org.elasticsearch.hadoop.util.Assert.hasText(Assert.java:30)
at org.elasticsearch.hadoop.util.ObjectUtils.instantiate(ObjectUtils.java:32)
at org.elasticsearch.hadoop.util.ObjectUtils.instantiate(ObjectUtils.java:52)
at org.elasticsearch.hadoop.util.ObjectUtils.instantiate(ObjectUtils.java:48)
at org.elasticsearch.hadoop.serialization.bulk.AbstractBulkFactory.initExtractorsFromSettings(AbstractBulkFactory.java:198)
at org.elasticsearch.hadoop.serialization.bulk.AbstractBulkFactory.<init>(AbstractBulkFactory.java:174)
at org.elasticsearch.hadoop.serialization.bulk.IndexBulkFactory.<init>(IndexBulkFactory.java:27)
at org.elasticsearch.hadoop.serialization.bulk.BulkCommands.create(BulkCommands.java:39)
at org.elasticsearch.hadoop.rest.RestRepository.lazyInitWriting(RestRepository.java:130)
at org.elasticsearch.hadoop.rest.RestRepository.writeProcessedToIndex(RestRepository.java:174)
at org.elasticsearch.hadoop.rest.RestRepository.delete(RestRepository.java:549)
at org.elasticsearch.spark.sql.ElasticsearchRelation.insert(DefaultSource.scala:481)
at org.elasticsearch.spark.sql.DefaultSource.createRelation(DefaultSource.scala:76)
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:222)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:139)
...
I'm not sure what that error message means. Following the stack trace led me to this line:
Assert.hasText(className, "No class name given"); // at org.elasticsearch.hadoop.util.ObjectUtils.instantiate(ObjectUtils.java:32)
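As far as I can tell, that assertion fires when the class name string passed to ObjectUtils.instantiate is null or empty. A hypothetical simplification of the check, just to illustrate the contract (the real es-hadoop utility is Java; sketched here in Scala):

import org.elasticsearch.hadoop.EsHadoopIllegalArgumentException

// Hypothetical simplification of org.elasticsearch.hadoop.util.Assert.hasText:
// throws with the supplied message when the given string is null or blank.
def hasText(text: CharSequence, message: String): Unit =
  if (text == null || text.toString.trim.isEmpty)
    throw new EsHadoopIllegalArgumentException(message)

So somewhere in AbstractBulkFactory.initExtractorsFromSettings an expected class-name setting apparently resolves to an empty string before instantiate() is called.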
Information about the data:
scala> data
// res0: org.apache.spark.sql.DataFrame = [doc_id: bigint, eventDate: timestamp, marketObjectId: bigint, eventType: string, merchantId: bigint, userId: bigint]
scala> data.printSchema
// root
// |-- doc_id: long (nullable = false)
// |-- eventDate: timestamp (nullable = true)
// |-- marketObjectId: long (nullable = true)
// |-- eventType: string (nullable = true)
// |-- merchantId: long (nullable = true)
// |-- userId: long (nullable = true)
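For completeness, a minimal sketch that builds a DataFrame with this schema (the ClickEvent case class and the sample row are hypothetical, purely to reproduce the problem; note that primitive Long fields come out nullable = false, unlike the original):

import java.sql.Timestamp
import org.apache.spark.sql.SQLContext

// Hypothetical case class mirroring the schema above (field names match).
case class ClickEvent(doc_id: Long, eventDate: Timestamp, marketObjectId: Long,
                      eventType: String, merchantId: Long, userId: Long)

val sqlContext = new SQLContext(sc) // sc: the existing SparkContext
import sqlContext.implicits._

// A single sample row is enough to exercise the write path.
val data = sc.parallelize(Seq(
  ClickEvent(1L, new Timestamp(System.currentTimeMillis()), 42L, "click", 7L, 99L)
)).toDF()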
Code snippet:
val config: scala.collection.mutable.Map[String, String] =
  scala.collection.mutable.Map(
    "pushdown" -> "true",
    "es.nodes" -> "localhost:9200", // params.esHost
    "es.mapping.id" -> "doc_id"
  )

data.write.format("org.elasticsearch.spark.sql")
  .mode(SaveMode.Overwrite)
  .options(config)
  .save("clicks/event")
I'm using Spark 1.6.2 with elasticsearch-spark_2.10 version 5.0.0.BUILD-SNAPSHOT in standalone mode. The Elasticsearch instance reports:
{
  "name" : "Callisto",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "5.0.0-alpha4",
    "build_hash" : "3f5b994",
    "build_date" : "2016-06-27T16:23:46.861Z",
    "build_snapshot" : false,
    "lucene_version" : "6.1.0"
  },
  "tagline" : "You Know, for Search"
}
The project is assembled with Maven (mvn assembly:assembly) into an uber-jar, so all the dependencies are available.
EDIT: Reading data from Elasticsearch works perfectly:
sqlContext.read.format("org.elasticsearch.spark.sql")
  .options(config)
  .load("clicks/event")
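Since the stack trace goes through RestRepository.delete, the failure seems to happen during the index wipe that SaveMode.Overwrite performs before writing. One diagnostic I could try (a sketch of my own, not a confirmed fix) is the same write with SaveMode.Append, which skips that delete step:

// Diagnostic only: check whether the failure is tied to the Overwrite/delete
// path in the stack trace rather than to the write itself.
data.write.format("org.elasticsearch.spark.sql")
  .mode(SaveMode.Append) // no upfront deletion of clicks/event
  .options(config)
  .save("clicks/event")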
How can I fix this?
Any help would be appreciated. Thanks!