Unable to create external table in elastic search using es-hadoop

Gangadhar_dantusetty · January 25, 2017, 7:02am

I am running a simple spark-submit job, e.g.:

spark-submit --class com.x.y.z.logan /home/test/spark/sample.jar

The jar file contains the following table definition and insert:

hiveContext.sql("CREATE EXTERNAL TABLE IF NOT EXISTS databasename.tablename(es_column_name STRING) STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES('es.resource' = 'index_name/log','es.mapping.names' ='tablecoulmname :es_column_name ', 'es.nodes' = '192.168.x.1y:9200','es.input.json' = 'false', 'es.index.read.missing.as.empty' = 'yes' ,'es.index.auto.create' = 'yes') ")

hiveContext.sql("INSERT INTO TABLE databasename.tablename SELECT s.tablecolumname FROM temporarytable s");

ES-Hadoop works fine when we create the external table in the Hive interface and load data into it from a Hive table. It does not work when we include the same query in the jar. The same jar works fine when it creates a normal Hive table. When the jar creates the external table, we get the error below.

ERROR
client token: N/A
diagnostics: User class threw exception: java.lang.RuntimeException: java.lang.RuntimeException: class org.elasticsearch.hadoop.mr.EsOutputFormat$EsOutputCommitter not org.apache.hadoop.mapred.OutputCommitter
ApplicationMaster host: 192.168.x.y
ApplicationMaster RPC port: 0
queue: root.users.test
start time: 1485286033939
final status: FAILED
tracking URL: some URL
user: test
Exception in thread "main" org.apache.spark.SparkException: Application application_1485258812942_0008 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1035)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1082)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/01/24 14:27:39 INFO util.ShutdownHookManager: Shutdown hook called
17/01/24 14:27:39 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-c0cd21d0-ed0a-46c4-9171-2c6b7b55bc96

Can anyone help me fix this?

james.baiera · January 25, 2017, 11:04pm

@Gangadhar_dantusetty Is there any reason that you are using the Hive integration through Spark instead of the native Spark integration? The project does not do any testing with the hiveContext in Spark, so I'm afraid that my experience with that facet is limited...
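
For context, the native Spark integration writes a DataFrame to Elasticsearch directly rather than through the Hive storage handler. The following is only a rough sketch of that route, assuming Spark 1.x with a HiveContext and the elasticsearch-spark connector available at both compile time and runtime. The object name `NativeEsWrite` is hypothetical, and the table, column, index, and node values are simply carried over from the question (including its masked IP address).

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.elasticsearch.spark.sql._ // brings the implicit saveToEs method for DataFrames into scope

// Hypothetical driver object, not taken from the original job.
object NativeEsWrite {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("native-es-write"))
    val hiveContext = new HiveContext(sc)

    // Assumes temporarytable is visible to this context, as in the question.
    // The alias gives the field the name it should have in Elasticsearch.
    val df = hiveContext.sql(
      "SELECT s.tablecolumname AS es_column_name FROM temporarytable s")

    // Write straight to the index/type from the question; no external Hive table is needed.
    df.saveToEs("index_name/log", Map(
      "es.nodes" -> "192.168.x.1y:9200", // masked placeholder from the question
      "es.index.auto.create" -> "yes"))

    sc.stop()
  }
}
```

The connector jar also has to be available when the job runs, for instance through the `--jars` option of spark-submit or by bundling it into the application jar.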

Gangadhar_dantusetty · January 30, 2017, 10:43am

I also tried the native Spark integration, but it shows the error below.

     diagnostics: User class threw exception: java.lang.NoClassDefFoundError: org/elasticsearch/spark/package$
     ApplicationMaster host: 192.168.x.x
     ApplicationMaster RPC port: 0
     queue: root.users.test
     start time: 1485792227843
     final status: FAILED
     tracking URL: some url 
     user: test

Exception in thread "main" org.apache.spark.SparkException: Application application_1485790878058_0002 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1035)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1082)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/01/30 11:04:06 INFO util.ShutdownHookManager: Shutdown hook called
17/01/30 11:04:06 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-15c11bbd-b23c-4c2e-b42c-3110512db097
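
`org/elasticsearch/spark/package$` is the compiled form of the `org.elasticsearch.spark` package object. A NoClassDefFoundError for it generally means the class was present at compile time but could not be loaded at runtime, typically because the connector jar was not shipped with the job. One way to check is a small classpath probe called from inside the job. The sketch below is only an illustration; `ConnectorCheck` and `isOnClasspath` are hypothetical names, not part of es-hadoop.

```scala
// Hypothetical helper for checking whether a class is loadable at runtime.
object ConnectorCheck {

  // Returns true when the named class can be located by the context class loader.
  // The class is not initialized, so static initializers do not run.
  def isOnClasspath(className: String): Boolean =
    try {
      Class.forName(className, false, Thread.currentThread().getContextClassLoader)
      true
    } catch {
      case _: ClassNotFoundException => false
      case _: LinkageError           => false
    }

  def main(args: Array[String]): Unit = {
    // JVM name of the package object reported in the error above.
    val name = "org.elasticsearch.spark.package$"
    if (isOnClasspath(name)) println(s"$name is on the classpath")
    else println(s"$name is missing; the connector jar is probably not on the runtime classpath")
  }
}
```

If the probe reports the class as missing, passing the connector jar with the `--jars` option of spark-submit, or building an assembly jar that includes it, is the usual thing to try.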

system · February 27, 2017, 10:44am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
