[HADOOP] Elasticsearch and hive

costin · May 20, 2014, 1:11pm

You don't need to add the Hive jars (hive-serdes) in your script; they are already part of the runtime.
Use es-hadoop 2.0.RC1; 1.3.0.M2 is fairly old.
Make sure the jar is actually available and accessible to Hive. Also note that when running Hive against a cluster,
it's best to have the jars available on all nodes, which is why most folks copy them to HDFS and refer to that location.
While /home/hduser might be available locally, it might not exist on the Hive server, as the error indicates.
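
As a rough sketch of the HDFS approach, the statements below show one way the jar could be referenced. The HDFS directory /user/hduser/lib and the jar file name are assumptions for illustration only, and whether ADD JAR accepts an hdfs:// URI depends on the Hive version in use.

```sql
-- Assumption: the jar was first copied to HDFS from a shell, for example
--   hadoop fs -put elasticsearch-hadoop-2.0.0.RC1.jar /user/hduser/lib/
-- (/user/hduser/lib is a hypothetical location; use whatever path exists on your cluster).

-- Reference the HDFS copy rather than a path that only exists on the client machine.
ADD JAR hdfs:///user/hduser/lib/elasticsearch-hadoop-2.0.0.RC1.jar;

-- Then run the job as before.
INSERT OVERWRITE TABLE tweetsES SELECT * FROM tweets;
```

If your Hive version does not accept HDFS URIs in ADD JAR, an alternative is to put the jar on the Hive classpath at startup through the hive.aux.jars.path property, using a path that the Hive server can read.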

Hope this helps,

On 5/20/14 3:58 PM, hanine haninne wrote:

Hello everybody,

I'm trying to take data from Hive and put it into Elasticsearch.
Here is the script and the error:

hive> ADD JAR /home/hduser/hadoop/hive-0.11.0/lib/hive-serdes-1.0-SNAPSHOT.jar;
Added /home/hduser/hadoop/hive-0.11.0/lib/hive-serdes-1.0-SNAPSHOT.jar to class path
Added resource: /home/hduser/hadoop/hive-0.11.0/lib/hive-serdes-1.0-SNAPSHOT.jar
hive> CREATE EXTERNAL TABLE tweetsES (
> id BIGINT,
> created_at STRING,
> source STRING,
> favorited BOOLEAN,
> retweet_count INT,
> retweeted_status STRUCT<
> text:STRING,
> user:STRUCT<screen_name:STRING,name:STRING>>,
> entities STRUCT<
> urls:ARRAY<STRUCT<expanded_url:STRING>>,
> user_mentions:ARRAY<STRUCT<screen_name:STRING,name:STRING>>,
> hashtags:ARRAY<STRUCT<text:STRING>>>,
> text STRING,
> user STRUCT<
> screen_name:STRING,
> name:STRING,
> friends_count:INT,
> followers_count:INT,
> statuses_count:INT,
> verified:BOOLEAN,
> utc_offset:INT,
> time_zone:STRING>,
> in_reply_to_screen_name STRING
> )
> STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
> TBLPROPERTIES('es.resource' = 'reseaux_sociaux/tweets','es.input.json' = 'yes');
OK
Time taken: 4.292 seconds
hive> ADD JAR /home/hduser/hadoop/elasticsearch-hadoop-1.3.0.M2.jar;
Added /home/hduser/hadoop/elasticsearch-hadoop-1.3.0.M2.jar to class path
Added resource: /home/hduser/hadoop/elasticsearch-hadoop-1.3.0.M2.jar
hive> INSERT OVERWRITE TABLE tweetsES SELECT * FROM tweets;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.io.FileNotFoundException: File does not exist: /home/hduser/hadoop/elasticsearch-hadoop-1.3.0.M2.jar
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:558)
at org.apache.hadoop.filecache.DistributedCache.getFileStatus(DistributedCache.java:185)
at org.apache.hadoop.filecache.TrackerDistributedCacheManager.getFileStatus(TrackerDistributedCacheManager.java:723)
at org.apache.hadoop.filecache.TrackerDistributedCacheManager.determineTimestamps(TrackerDistributedCacheManager.java:778)
at org.apache.hadoop.filecache.TrackerDistributedCacheManager.determineTimestampsAndCacheVisibilities(TrackerDistributedCacheManager.java:755)
at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:843)
at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:734)
at org.apache.hadoop.mapred.JobClient.access$400(JobClient.java:179)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:951)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:144)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /home/hduser/hadoop/elasticsearch-hadoop-1.3.0.M2.jar)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask

That path does exist: /home/hduser/hadoop/elasticsearch-hadoop-1.3.0.M2.jar
Any help would be much appreciated.
Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/d1613b48-e58d-4077-a579-92b46a97bcff%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

--
Costin

