FileNotFoundException thrown while inserting into external ES table from Hive


(Sheryl John) #1

Hi,

I am using Hive 1.1.0 and Hadoop 2.6.0, and I downloaded the elasticsearch-hadoop-2.1.0 jar file.

After following the instructions in the docs, I was able to create an external table:

    CREATE EXTERNAL TABLE movies (
        movieid INT,
        title STRING,
        genres STRING)
    STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
    TBLPROPERTIES('es.resource' = 'movietime/movies');

However, while trying to insert into the table I'm getting the following exception: https://gist.github.com/sherylj/ccd924be35537b4aa0e8

The elasticsearch-hadoop-2.1.0.jar is placed in $HIVE_HOME/lib. I've also specified the jar path in hive-site.xml:

The jar is available on HDFS at the path specified above, so I'm confused as to why the exception says the file is not found. Please let me know if I'm missing something or how I can fix it.

Thanks,
Sheryl


(Costin Leau) #2

Try using the public IP (192.x.x.x) of your HDFS node instead of localhost. An easy way to double check whether the file is actually there and readable by Hive is to access HDFS through the Hadoop web interface and check its size and location.
You could also use the hdfs command line to check this.

P.S. Thanks for a well formatted post :smile:


(Sheryl John) #3

Thanks for your reply Costin!
I tried using the IP but got the same error. My HDFS setup is standalone local, and there's probably something wrong there that is causing the issue.
Instead of specifying the HDFS path to the jar, I updated hive-site.xml to point to the local path of the elasticsearch-hadoop jar, and that worked. I was able to write to the Elasticsearch index from a Hive table. Yay!
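
For anyone landing here with the same error, a hive-site.xml entry of the kind described above might look roughly like the sketch below. This is an assumption about the shape of the fix, not the exact configuration from this thread: `hive.aux.jars.path` is the Hive property normally used to register extra jars, and the path shown is a placeholder to replace with the real local location of the jar.

```xml
<!-- Sketch only: the path below is a placeholder, not the one used in this thread. -->
<property>
  <name>hive.aux.jars.path</name>
  <!-- Local filesystem location of the elasticsearch-hadoop jar (assumed path) -->
  <value>file:///path/to/elasticsearch-hadoop-2.1.0.jar</value>
</property>
```

Hive usually needs to be restarted for a change to hive-site.xml to take effect.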

Not sure why the error was thrown when I gave an HDFS path to the jar. It could be an HDFS or Hive issue.
Thanks!

