Unable to create external table in Elasticsearch using ES-Hadoop


(Gangadhar Dantusetty) #1

I am running a simple spark-submit job, e.g.:

spark-submit --class com.x.y.z.logan /home/test/spark/sample.jar

The jar creates and loads the table with these statements:

hiveContext.sql("CREATE EXTERNAL TABLE IF NOT EXISTS databasename.tablename(es_column_name STRING) STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES('es.resource' = 'index_name/log', 'es.mapping.names' = 'tablecolumname:es_column_name', 'es.nodes' = '192.168.x.1y:9200', 'es.input.json' = 'false', 'es.index.read.missing.as.empty' = 'yes', 'es.index.auto.create' = 'yes')");

hiveContext.sql("INSERT INTO TABLE databasename.tablename SELECT s.tablecolumname FROM temporarytable s");

ES-Hadoop works fine when we create the external table from the Hive interface and load data into it from a Hive table. It does not work when the same queries are included in the jar. The same jar works fine when it creates a normal Hive table. When the jar creates the external table, the job fails with the error below:

ERROR
client token: N/A
diagnostics: User class threw exception: java.lang.RuntimeException: java.lang.RuntimeException: class org.elasticsearch.hadoop.mr.EsOutputFormat$EsOutputCommitter not org.apache.hadoop.mapred.OutputCommitter
ApplicationMaster host: 192.168.x.y
ApplicationMaster RPC port: 0
queue: root.users.test
start time: 1485286033939
final status: FAILED
tracking URL: some URL
user: test
Exception in thread "main" org.apache.spark.SparkException: Application application_1485258812942_0008 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1035)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1082)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/01/24 14:27:39 INFO util.ShutdownHookManager: Shutdown hook called
17/01/24 14:27:39 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-c0cd21d0-ed0a-46c4-9171-2c6b7b55bc96

Can anyone help me fix this?


(James Baiera) #2

@Gangadhar_dantusetty Is there any reason that you are using the Hive integration through Spark instead of the native Spark integration? The project does not do any testing with the hiveContext in Spark, so I'm afraid that my experience with that facet is limited...


(Gangadhar Dantusetty) #3

I also tried the native Spark integration, but it fails with the error below:

     diagnostics: User class threw exception: java.lang.NoClassDefFoundError: org/elasticsearch/spark/package$
     ApplicationMaster host: 192.168.x.x
     ApplicationMaster RPC port: 0
     queue: root.users.test
     start time: 1485792227843
     final status: FAILED
     tracking URL: some url 
     user: test

Exception in thread "main" org.apache.spark.SparkException: Application application_1485790878058_0002 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1035)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1082)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/01/30 11:04:06 INFO util.ShutdownHookManager: Shutdown hook called
17/01/30 11:04:06 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-15c11bbd-b23c-4c2e-b42c-3110512db097
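
A `NoClassDefFoundError` for `org/elasticsearch/spark/package$` generally means the JVM could not load that class at runtime, which usually points at the connector jar not being on the driver's classpath (for example, not bundled into the application jar and not passed through spark-submit's `--jars` option). The sketch below is only a diagnostic, not part of ES-Hadoop: `ClasspathCheck` and `isOnClasspath` are hypothetical names, and the check assumes the class would be visible through the thread's context class loader. It could be called at the start of the job to confirm whether the class named in the error is reachable.

```scala
object ClasspathCheck {
  // Returns true when the named class can be located without initializing it.
  // Uses the thread's context class loader, falling back to this object's loader.
  def isOnClasspath(className: String): Boolean = {
    val loader = Option(Thread.currentThread().getContextClassLoader)
      .getOrElse(getClass.getClassLoader)
    try {
      Class.forName(className, false, loader)
      true
    } catch {
      case _: ClassNotFoundException | _: LinkageError => false
    }
  }

  def main(args: Array[String]): Unit = {
    // Class named in the NoClassDefFoundError above.
    val esSparkClass = "org.elasticsearch.spark.package$"
    if (isOnClasspath(esSparkClass))
      println(s"$esSparkClass is visible to this JVM")
    else
      println(s"$esSparkClass is missing; the connector jar is probably not being shipped with the job")
  }
}
```

If the check reports the class as missing on the cluster, the likely next step is to make the connector jar available to the job, either by building it into the application jar or by listing it with `--jars` on the spark-submit command line.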

