Elastic Stack 5.4.0 is breaking for Spark 2.1.1

Does the current library support Spark 2.1.1? The code is breaking for me: printSchema works fine, but show and collect break.

Could you link the logs/stacktraces that you are seeing? Have you upgraded from a previous version or is this a fresh install?

James

I did a fresh installation of Spark 2.1.1 and installed ELK 5.4 and ES-Hadoop
5.4. With the previous version I had issues reading a child array into a DataFrame.
I can printSchema, but select crashes.

Having upgraded to the latest versions of Spark and ELK, I am unable to run a
basic select on a DataFrame. Please refer to the stack trace below.
Traceback (most recent call last):
  File "/Users/anupamjaiswal/Documents/aj/pyspk.py", line 28, in <module>
    df.select("tags").show()
  File "/usr/local/Cellar/apache-spark/2.1.1/libexec/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 318, in show
    print(self._jdf.showString(n, 20))
  File "/usr/local/Cellar/apache-spark/2.1.1/libexec/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/usr/local/Cellar/apache-spark/2.1.1/libexec/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
    return f(*a, **kw)
  File "/usr/local/Cellar/apache-spark/2.1.1/libexec/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
    format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling o37.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class
	at org.elasticsearch.spark.rdd.AbstractEsRDDIterator.<init>(AbstractEsRDDIterator.scala:28)
	at org.elasticsearch.spark.sql.ScalaEsRowRDDIterator.<init>(ScalaEsRowRDD.scala:49)
	at org.elasticsearch.spark.sql.ScalaEsRowRDD.compute(ScalaEsRowRDD.scala:45)
	at org.elasticsearch.spark.sql.ScalaEsRowRDD.compute(ScalaEsRowRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)

Which version of Scala are you running Spark on top of, and which specific version of ES-Hadoop are you using? Generally, seeing java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class means that you are mixing Scala 2.10 and 2.11 artifacts.

We are using PySpark for coding and have Scala version 2.11.8.

Is there any update?

Sorry for disappearing there. This error almost always means that you are using the incorrect version of ES-Hadoop for your version of Scala. Do you have the artifact name for elasticsearch-spark that you are using? It should end in _2.11
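If it helps, a quick way to verify the match (assuming a local spark-submit is on the PATH; the script name is a placeholder):

```shell
# Print the Scala version Spark was built with (it appears in the version banner).
spark-submit --version 2>&1 | grep -i "scala version"

# The connector artifact's suffix must match that Scala version,
# e.g. for Scala 2.11:
spark-submit --packages org.elasticsearch:elasticsearch-spark-20_2.11:5.4.0 your_script.py
```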

My bad, I was using the wrong version. Do you have a working example of
exploding nested JSON stored in Elasticsearch? I would appreciate your help.

I have the same problem; my Scala version and elasticsearch-spark are both 2.11.
I can successfully insert data into Elasticsearch, but I can't read data from it.
(with Elastic Stack 5.4.0 and Spark 2.1.1)

My code:

val es_df = sqc.read.format("org.elasticsearch.spark.sql").load("test/daily_player_game")

My Maven Dependencies:

		<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11 -->
		<dependency>
			<groupId>org.apache.spark</groupId>
			<artifactId>spark-core_2.11</artifactId>
			<version>2.1.1</version>
		</dependency>

		<dependency>
			<groupId>org.apache.spark</groupId>
			<artifactId>spark-streaming_2.11</artifactId>
			<version>2.1.1</version>
		</dependency>

		<!-- https://mvnrepository.com/artifact/org.elasticsearch/elasticsearch-hadoop -->
		<dependency>
			<groupId>org.elasticsearch</groupId>
			<artifactId>elasticsearch-spark-20_2.11</artifactId>
			<version>5.3.1</version>
		</dependency>

Error Message:

17/06/13 10:01:04 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 192.168.1.183, executor 2): java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class
	at org.elasticsearch.spark.rdd.AbstractEsRDDIterator.<init>(AbstractEsRDDIterator.scala:28)
	at org.elasticsearch.spark.sql.ScalaEsRowRDDIterator.<init>(ScalaEsRowRDD.scala:49)
	at org.elasticsearch.spark.sql.ScalaEsRowRDD.compute(ScalaEsRowRDD.scala:45)
	at org.elasticsearch.spark.sql.ScalaEsRowRDD.compute(ScalaEsRowRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:99)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: scala.collection.GenTraversableOnce$class
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

@byakuinss, are you sure that there are no conflicting jars on your cluster?

Yes. There is only one node in my ELK cluster, and it's a new server that has never had any old-version jars on it.
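For what it's worth, one way to double-check a Maven build for mixed Scala artifacts (run from the directory containing the pom.xml above; the grep pattern is illustrative):

```shell
# List every resolved dependency and flag any carrying a Scala-version suffix;
# all matches should carry the same suffix (_2.11 here), with no _2.10 strays.
mvn dependency:tree | grep -E '_2\.1[01]'
```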


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.