Getting error on sc.esRDD (ES-Hadoop 5.2.2, Spark 2.1.0)

Am I missing something in the --jars option? Or maybe the Scala version of the ES-Hadoop jar does not match the Spark version?

I am using this to start the Scala shell in Spark:

spark-shell --master local --jars /usr/local/spark/elasticsearch-hadoop-5.2.2/dist/elasticsearch-spark-20_2.10-5.2.2.jar

And I can successfully import org.elasticsearch.spark._

But when I run:

val RDD = sc.esRDD("filebeat-2017.03.26")
RDD.count()

I get:

ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class
at org.elasticsearch.spark.rdd.AbstractEsRDDIterator.&lt;init&gt;(AbstractEsRDDIterator.scala:28)
at org.elasticsearch.spark.rdd.ScalaEsRDDIterator.&lt;init&gt;(ScalaEsRDD.scala:43)
at org.elasticsearch.spark.rdd.ScalaEsRDD.compute(ScalaEsRDD.scala:39)
at org.elasticsearch.spark.rdd.ScalaEsRDD.compute(ScalaEsRDD.scala:33)

Maybe there is some version mismatch?

Spark version 2.1.0
Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_121)
ES-Hadoop 5.2.2
elasticsearch 5.2.2
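
(For reference, the Scala version the running shell actually uses can be checked from the spark-shell prompt itself; the output shown here assumes the 2.11.8 install listed above:)

scala> scala.util.Properties.versionString
res0: String = version 2.11.8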

Your ES-Hadoop artifact is the Scala 2.10 compatibility jar, but your Spark 2.1.0 build runs on Scala 2.11. The Spark artifact names follow the format elasticsearch-spark-{sparkMajorMinorVersionNoDots}_{scalaVersion}-{esHadoopVersion}.jar, so you will need to use elasticsearch-spark-20_2.11-5.2.2.jar instead of the _2.10 version.
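
A minimal corrected invocation as a sketch, assuming the _2.11 jar sits in the same dist directory as the _2.10 one you used:

spark-shell --master local --jars /usr/local/spark/elasticsearch-hadoop-5.2.2/dist/elasticsearch-spark-20_2.11-5.2.2.jar

Then the same code from the thread should run without the NoClassDefFoundError:

scala> import org.elasticsearch.spark._
scala> val RDD = sc.esRDD("filebeat-2017.03.26")
scala> RDD.count()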
