I am running a simple spark-submit job, e.g.:
spark-submit --class com.x.y.z.logan /home/test/spark/sample.jar
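No extra jars are shipped with the application. If elasticsearch-hadoop had to be passed explicitly, I assume the command would look like this (the jar path and version are placeholders):

spark-submit --class com.x.y.z.logan \
  --jars /path/to/elasticsearch-hadoop-&lt;version&gt;.jar \
  /home/test/spark/sample.jar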
The table is created from inside the jar:
hiveContext.sql("CREATE EXTERNAL TABLE IF NOT EXISTS databasename.tablename(es_column_name STRING) STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES('es.resource' = 'index_name/log','es.mapping.names' ='tablecoulmname :es_column_name ', 'es.nodes' = '192.168.x.1y:9200','es.input.json' = 'false', 'es.index.read.missing.as.empty' = 'yes' ,'es.index.auto.create' = 'yes') ")
hiveContext.sql("INSERT INTO TABLE databasename.tablename SELECT s.tablecolumname FROM temporarytable s");
(ES-Hadoop works fine when we create the external table in the Hive interface and load data into it from a Hive table. It does not work when we include the same query in the jar; the same jar works fine when it creates a normal Hive table. The problem is that the job fails with the error below whenever the external-table statement is included in the jar.)
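For comparison, the flow that works from the Hive shell looks roughly like this (the ADD JAR path is a placeholder for wherever elasticsearch-hadoop lives on our cluster; registering the jar this way is what the es-hadoop docs describe):

ADD JAR /path/to/elasticsearch-hadoop-&lt;version&gt;.jar;
CREATE EXTERNAL TABLE IF NOT EXISTS databasename.tablename (es_column_name STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'index_name/log', 'es.nodes' = '192.168.x.1y:9200');
INSERT INTO TABLE databasename.tablename SELECT s.tablecolumnname FROM temporarytable s;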
ERROR
client token: N/A
diagnostics: User class threw exception: java.lang.RuntimeException: java.lang.RuntimeException: class org.elasticsearch.hadoop.mr.EsOutputFormat$EsOutputCommitter not org.apache.hadoop.mapred.OutputCommitter
ApplicationMaster host: 192.168.x.y
ApplicationMaster RPC port: 0
queue: root.users.test
start time: 1485286033939
final status: FAILED
tracking URL: some URL
user: test
Exception in thread "main" org.apache.spark.SparkException: Application application_1485258812942_0008 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1035)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1082)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/01/24 14:27:39 INFO util.ShutdownHookManager: Shutdown hook called
17/01/24 14:27:39 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-c0cd21d0-ed0a-46c4-9171-2c6b7b55bc96
Can anyone help me figure out how to fix this?