SparkSQL to Elasticsearch compatibility problem

Eduardo_Curonisy · March 16, 2017, 11:54am

Hi,
I have one cluster running Hadoop (Cloudera 5.8) and a separate cluster running ES 5.2.2.
I want to use Spark to write data from Hive to ES, but I am having problems with the Java version.

Java version: 1.7
ES version: 5.2.2
Spark: 1.6
Scala: 2.10

The Hadoop cluster runs Java 1.7. In my POM file I use the "elasticsearch-spark_2.10" connector, version 2.2.1.

With elasticsearch-spark_2.10 version 2.2.1 I get this error:

2017-03-16 13:04:02 DEBUG EsDataFrameWriter:180 - Discovered Elasticsearch version [5.2.2]
2017-03-16 13:04:02 DEBUG HttpMethodBase:1024 - Resorting to protocol version default close connection policy
2017-03-16 13:04:02 DEBUG HttpMethodBase:1028 - Should NOT close connection, using HTTP/1.1
2017-03-16 13:04:02 DEBUG HttpConnection:1178 - Releasing connection back to connection manager.
2017-03-16 13:04:02 ERROR Executor:95 - Exception in task 1.1 in stage 5.0 (TID 1254)
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens when accessing a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
at org.elasticsearch.hadoop.rest.InitializationUtils.discoverEsVersion(InitializationUtils.java:190)
at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:379)
at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:40)
at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:55)
at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:55)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Unsupported/Unknown Elasticsearch version 5.2.2
at org.elasticsearch.hadoop.rest.InitializationUtils.discoverEsVersion(InitializationUtils.java:185)
... 10 more

If I change my POM to use elasticsearch-spark_2.10 version 5.0.0-alpha4, which is the version compatible with ES 5.2.2, I get a different error:

ApplicationMaster:95 - User class threw exception: java.lang.UnsupportedClassVersionError: org/elasticsearch/spark/rdd/CompatUtils : Unsupported major.minor version 52.0
java.lang.UnsupportedClassVersionError: org/elasticsearch/spark/rdd/CompatUtils : Unsupported major.minor version 52.0

I think this is because elasticsearch-spark_2.10 (both versions?) is compiled with Java 1.8 while my environment runs Java 1.7. If that is the cause, is there a way to recompile elasticsearch-spark_2.10 for Java 1.7?

Thanks.

james.baiera · March 16, 2017, 5:51pm

@Eduardo_Curonisy In the first case, you are using version 2.2.1 of ES-Hadoop, which can only interact with earlier versions of Elasticsearch (2.2 and below). You are correct that you need to upgrade to a newer version of the connector.

I would avoid using the *-alpha releases of the connector at this point. ES-Hadoop is released in lock step with Elasticsearch now, so version 5.2.2 is already out and will be the most compatible with your version of Elasticsearch. Generally, it's best to keep ES-Hadoop at the same version or higher (we support backwards compatibility).
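As a rough illustration of that rule of thumb, here is a minimal Java sketch that checks whether a connector version is a stable release at the same version as the cluster or higher. The class and method names are hypothetical and not part of ES-Hadoop, and the check encodes only the guidance in this reply, not an official compatibility matrix.

```java
// Hypothetical helper, not part of ES-Hadoop: encodes only the rule of thumb
// "use a stable connector at the same version as Elasticsearch or higher".
final class ConnectorVersionCheck {

    // Parses "major.minor.patch", ignoring a qualifier such as "-alpha4".
    static int[] parse(String version) {
        String numeric = version.split("-", 2)[0];
        String[] parts = numeric.split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < out.length && i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    // Treats any version with a qualifier (alpha, beta, rc) as a pre-release.
    static boolean isPreRelease(String version) {
        return version.contains("-");
    }

    // True when the connector is a stable release and is not older than the cluster.
    static boolean connectorCoversCluster(String connector, String cluster) {
        if (isPreRelease(connector)) {
            return false;
        }
        int[] c = parse(connector);
        int[] e = parse(cluster);
        for (int i = 0; i < 3; i++) {
            if (c[i] != e[i]) {
                return c[i] > e[i];
            }
        }
        return true; // identical versions
    }

    public static void main(String[] args) {
        String cluster = "5.2.2";
        for (String connector : new String[] {"2.2.1", "5.0.0-alpha4", "5.2.2"}) {
            System.out.println(connector + " vs ES " + cluster + ": "
                    + connectorCoversCluster(connector, cluster));
        }
    }
}
```

With the versions from this thread, the sketch rejects both 2.2.1 (too old) and 5.0.0-alpha4 (pre-release) and accepts 5.2.2.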

It was also discovered early in the 5.0 release cycle that a change in the build process meant that Scala classes in the Spark package were being compiled for Java 8 instead of in compatibility mode for Java 6. This was fixed and should be correct in version 5.2.2.
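If you want to confirm which Java level a class in the jar was compiled for, you can read the class file header yourself. "Unsupported major.minor version 52.0" refers to the class file major version: 50 is Java 6, 51 is Java 7 and 52 is Java 8. The sketch below is a hypothetical helper, not part of ES-Hadoop; it assumes the connector jar is on the classpath and takes the resource path of the class as its only argument.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of ES-Hadoop.
final class ClassFileVersion {

    // A class file starts with: magic 0xCAFEBABE (4 bytes),
    // minor version (2 bytes), major version (2 bytes).
    static int majorVersion(byte[] header) {
        if (header.length < 8) {
            throw new IllegalArgumentException("need at least 8 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(header); // big-endian by default, like class files
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        buf.getShort(); // skip the minor version
        return buf.getShort() & 0xFFFF;
    }

    // Maps a major version to the Java release: 50 -> 6, 51 -> 7, 52 -> 8.
    static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: ClassFileVersion <resource path of a .class file>");
            return;
        }
        // Example argument: org/elasticsearch/spark/rdd/CompatUtils.class
        String resource = args[0];
        InputStream in = ClassLoader.getSystemResourceAsStream(resource);
        if (in == null) {
            throw new IOException("not on classpath: " + resource);
        }
        byte[] header = new byte[8];
        try {
            new DataInputStream(in).readFully(header);
        } finally {
            in.close();
        }
        int major = majorVersion(header);
        System.out.println(resource + ": major version " + major
                + " (Java " + javaRelease(major) + ")");
    }
}
```

Run it with the connector jar on the classpath and `org/elasticsearch/spark/rdd/CompatUtils.class` as the argument. A result of 52 means the class needs Java 8 and will fail on a Java 7 runtime with the error shown above. `javap -verbose` on the same class reports the same "major version" value.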

