Unable to get elasticsearch-hadoop working with Hive/Beeline

Hi!

I've followed the various guides to get going with the
elasticsearch-hadoop integration in Hive, but I've run into an issue:

add jar hdfs://host:9000//lib/elasticsearch-hadoop-hive-2.1.0.Beta4.jar;
INFO : converting to local hdfs://host:9000//lib/elasticsearch-hadoop-hive-2.1.0.Beta4.jar
INFO : Added [/tmp/15207d6b-e4b5-446b-bbe2-cff282056983_resources/elasticsearch-hadoop-hive-2.1.0.Beta4.jar] to class path
INFO : Added resources: [hdfs://host:9000//lib/elasticsearch-hadoop-hive-2.1.0.Beta4.jar]
No rows affected (0.122 seconds)

Then I am able to create an external table:

CREATE EXTERNAL table estest (field STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'hadoop/hadoop', 'es.index.auto.create' = 'false');

No rows affected (0.094 seconds)

However, when I try to interact I get this error:

select * from estest;
Error: java.lang.NoClassDefFoundError: org/elasticsearch/hadoop/serialization/dto/Node (state=,code=0)

As you can see I've followed the recommendation to put the jar file in
HDFS, and it seems like the jar is picked up in the classpath since without
the 'add jar' I get another error stating that the EsStorageHandler can't
be found. Any clues as to why this is happening?

-ra

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/9c88299a-8646-4aa0-ba65-aa834d542dff%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Hi,

It seems you are running into a classpath problem. The class mentioned in the exception
(org/elasticsearch/hadoop/serialization/dto/Node) is part of the elasticsearch-hadoop-hive-XXX.jar - you can verify
this yourself.
The fact that it is not found at runtime suggests that a different or incomplete jar is being used instead. This can
occur, for example, if a different jar is available on the Hive/Hadoop classpath, gets picked up automatically, and
overrides the one you add in your script.
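
One way to run that check, sketched in Python since a jar is an ordinary zip archive. The helper name `jar_contains_class` is made up for illustration, and the jar path in the usage comment is an assumption; point it at your own local copy.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar at jar_path packages the given class.

    class_name may be written with dots or slashes, for example
    'org.elasticsearch.hadoop.serialization.dto.Node'.
    """
    # Inside a jar, a class is stored as <package path>/<ClassName>.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Usage (the local path is hypothetical):
# jar_contains_class("elasticsearch-hadoop-hive-2.1.0.Beta4.jar",
#                    "org.elasticsearch.hadoop.serialization.dto.Node")
```

If this returns False for the jar you registered, the jar itself is incomplete; if it returns True, the more likely explanation is that another jar is shadowing it at runtime.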

So first, double-check the existing classpath - in the vast majority of Hive problems, this was the issue (an
old version was picked up instead). You can also verify this by trying to register the table - you should get an
exception right away. Once that's done, try different ways of adding the jar to your script classpath - it might be that
Beeline has a different mechanism than vanilla Hive.
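
A rough sketch of how the classpath check could be automated, assuming you know which directories Hive and Hadoop load jars from (the directories in the usage comment are guesses, not something stated in this thread). The function name `scan_for_es_hadoop_jars` is hypothetical. It walks the given directories, finds every `elasticsearch-hadoop*.jar`, and reports whether each one contains the `Node` class from the exception.

```python
import os
import zipfile

# The class named in the NoClassDefFoundError, as a jar entry
NODE_CLASS = "org/elasticsearch/hadoop/serialization/dto/Node.class"


def scan_for_es_hadoop_jars(directories, entry=NODE_CLASS):
    """Map each elasticsearch-hadoop*.jar under the given directories to
    True/False (entry present or missing) or None (jar unreadable)."""
    results = {}
    for directory in directories:
        # os.walk yields nothing for a directory that does not exist
        for root, _dirs, files in os.walk(directory):
            for name in sorted(files):
                if not (name.startswith("elasticsearch-hadoop") and name.endswith(".jar")):
                    continue
                path = os.path.join(root, name)
                try:
                    with zipfile.ZipFile(path) as jar:
                        results[path] = entry in jar.namelist()
                except zipfile.BadZipFile:
                    # A truncated or corrupt jar is itself a likely culprit
                    results[path] = None
    return results


# Usage (directory names are assumptions about a typical install):
# for path, ok in scan_for_es_hadoop_jars(["/usr/lib/hive/lib", "/usr/lib/hadoop/lib"]).items():
#     print(path, ok)
```

More than one hit, or a hit marked False or None, would point to the kind of conflicting or incomplete jar described above.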

Hope this helps,

On 4/29/15 12:58 AM, Rasmus Aveskogh wrote:

--
Costin

Thanks. We got it working by adding the jar to the Hive config rather than
using "add jar".
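
The thread does not say which setting was changed. One common option, assumed here, is the `hive.aux.jars.path` property in hive-site.xml, which takes a comma-separated list of jar locations. The sketch below only renders that `<property>` element so it can be pasted into the config; the helper name `aux_jars_property` and the jar location in the usage comment are hypothetical.

```python
import xml.etree.ElementTree as ET


def aux_jars_property(jar_uris):
    """Render a hive-site.xml <property> element for hive.aux.jars.path.

    Assumption: the jar was registered through hive.aux.jars.path; the
    original post only says "the hive-config".
    """
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "hive.aux.jars.path"
    # Multiple jars are joined into one comma-separated value
    ET.SubElement(prop, "value").text = ",".join(jar_uris)
    return ET.tostring(prop, encoding="unicode")


# Usage (the file location is hypothetical):
# print(aux_jars_property(["file:///opt/hive/auxlib/elasticsearch-hadoop-hive-2.1.0.Beta4.jar"]))
```

Unlike a per-session "add jar", a setting like this is read when the Hive service starts, so a restart is normally needed before Beeline sessions see the jar.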

-ra

On Wednesday, 29 April 2015 at 00:11:47 UTC+2, Costin Leau wrote:

Glad to hear it got solved. By the way, which versions of Hive, Beeline, and
Hadoop are you using?

On Apr 29, 2015 3:18 PM, "Rasmus Aveskogh" aveskogh@gmail.com wrote:
