esRDD .count() is not working with my setup

Hi all!

This is the simplest sample I'm trying to get working (code + maven dependencies)

And it's crashing on the .count() call with this stack trace:
java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class
at org.elasticsearch.spark.rdd.AbstractEsRDDIterator.<init>(AbstractEsRDDIterator.scala:28)
at org.elasticsearch.spark.rdd.ScalaEsRDDIterator.<init>(ScalaEsRDD.scala:43)
at org.elasticsearch.spark.rdd.ScalaEsRDD.compute(ScalaEsRDD.scala:39)
at org.elasticsearch.spark.rdd.ScalaEsRDD.compute(ScalaEsRDD.scala:33)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
at org.apache.spark.executor.Executor$
at java.util.concurrent.ThreadPoolExecutor.runWorker(
at java.util.concurrent.ThreadPoolExecutor$

It seems to be an incompatibility issue, but I can't figure out where it is.

Please assist me.

Thanks, Art.

This error typically occurs when a library was built for a different version of Scala than the one it runs against. I'm assuming your Scala version is 2.11, since that's the Spark compatibility level you have in your dependencies. I would change your ES dependency to org.elasticsearch:elasticsearch-spark-20:5.2.0. If you deploy third-party jars to your cluster, make sure all nodes have this updated artifact.
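As a rough way to confirm this kind of mismatch, a small classpath probe can report whether the class named in the stack trace is visible to the JVM. This is only a sketch: the class name `ScalaBinaryCheck` and the helper `isLoadable` are made up for illustration, and whether the marker class exists depends on which scala-library jar is actually on the classpath.

```java
public class ScalaBinaryCheck {

    // Returns true if the named class can be located by this class's class loader.
    public static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ScalaBinaryCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The class the stack trace says is missing (name copied from the error message)
        String marker = "scala.collection.GenTraversableOnce$class";
        System.out.println(marker + " loadable: " + isLoadable(marker));
    }
}
```

Running this with the same classpath as the failing job (on the driver and, ideally, on an executor node) should show `false` where the jars are mismatched. It does not say which artifact is wrong, only that the Scala library visible to the JVM lacks the class the connector expects.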

Hey @james.baiera,

Thanks for your help. Actually, I figured this out the hard way :)

For future reference, this is the correct dependency to include in Maven's pom:


Best, Artem.
