Can't integrate Elasticsearch with Hive

Hi All,

I am using Hive 0.13.1 and trying to create an external table so data can
be loaded from Hive to Elasticsearch. However, I keep getting the following
error. I have tried with the following jars but get the same error. I would
really appreciate any pointers.

Thanks

  • Atul

<property>
   <name>hive.aux.jars.path</name>
   <!--
   <value>/apps/sas/elasticsearch-hadoop-2.0.2/dist/elasticsearch-hadoop-2.0.2.jar</value>
   -->
   <value>/apps/sas/elasticsearch-hadoop-2.1.0.Beta3/dist/elasticsearch-hadoop-2.1.0.Beta3.jar</value>
   <description>A comma separated list (with no spaces) of the jar files</description>
</property>

ERROR :

2014-11-26 23:09:22,069 ERROR [main]: exec.DDLTask (DDLTask.java:execute(478)) - java.lang.IllegalAccessError: tried to access class org.elasticsearch.hadoop.hive.HiveUtils from class org.elasticsearch.hadoop.hive.EsSerDe
    at org.elasticsearch.hadoop.hive.EsSerDe.initialize(EsSerDe.java:81)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:339)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:288)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:281)
    at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:631)
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:593)
    at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4189)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1503)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1270)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

2014-11-26 23:09:22,069 ERROR [main]: ql.Driver (SessionState.java:printError(545)) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. tried to access class org.elasticsearch.hadoop.hive.HiveUtils from class org.elasticsearch.hadoop.hive.EsSerDe

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/78b85fb6-6eea-46e8-964a-d96e324e780d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Hi,

The issue is most likely caused by two different versions of es-hadoop within your classpath, probably es-hadoop 2.0.x
(2.0.2) and 2.1.x (2.1.0.Beta3). If both are picked up by Hive or Hadoop, the JVM will have two jars with classes under
the same package name. This leads to weird conflicts, as classes from one jar can interact with classes from the other
jar, especially since the code went through major internal changes between 2.0.x and 2.1.x.

Make sure you have only one version of es-hadoop in your classpath, both on the client and in the cluster. That
includes the Hive classpath, the Hadoop classpath, as well as the submitting jar (since the library might be embedded).

P.S. IllegalAccessError indicates an illegal call, such as accessing a non-public class from a different class. However,
in this case both classes are in the same package and the HiveUtils class is not private...
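
One way to check for the duplicate-jar situation described above is to list every jar that ships the es-hadoop SerDe class. The following is a minimal sketch in Python, not an official es-hadoop tool: the function name `find_jars_with_class` and the directories passed to it are assumptions for illustration, while the class entry name is taken from the stack trace in this thread.

```python
import os
import zipfile

# Class entry taken from the stack trace in this thread.
ES_SERDE_ENTRY = "org/elasticsearch/hadoop/hive/EsSerDe.class"


def find_jars_with_class(roots, class_entry=ES_SERDE_ENTRY):
    """Return sorted paths of all jars under `roots` that contain `class_entry`.

    `roots` is a list of directories to walk (hypothetical: pass whatever
    directories feed your Hive and Hadoop classpaths).
    """
    hits = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if not name.endswith(".jar"):
                    continue
                path = os.path.join(dirpath, name)
                try:
                    with zipfile.ZipFile(path) as jar:
                        if class_entry in jar.namelist():
                            hits.append(path)
                except (zipfile.BadZipFile, OSError):
                    # Skip unreadable or corrupt archives.
                    continue
    return sorted(hits)
```

If a call such as `find_jars_with_class(["/apps/sas", "/apps/hadoop-2.5.1"])` returns more than one path, more than one jar provides `EsSerDe`, and all but one should be removed from the classpath.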

Cheers,

--
Costin


Hi Costin,

Thanks for your response. I tried all the cleanup but still no luck :-(
Here are the steps I tried:

  1. Removed es-hadoop 2.1.x completely from the server

  2. Updated the hive-site.xml as below, but it didn't work

<property>
   <name>hive.aux.jars.path</name>
   <value>/apps/sas/elasticsearch-hadoop-2.0.2/dist/elasticsearch-hadoop-2.0.2.jar</value>
   <description>A comma separated list (with no spaces) of the jar files</description>
</property>

  3. Added the jar file in the hiveconf as below, still the same issue

hive --hiveconf hive.aux.jars.path=/apps/sas/elasticsearch-hadoop-2.0.2/dist/elasticsearch-hadoop-2.0.2.jar

  4. Tried adding the jar file in the hive session, still didn't work

add jar /apps/sas/elasticsearch-hadoop-2.0.2/dist/elasticsearch-hadoop-2.0.2.jar;
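
The hive-site.xml step above can be sanity-checked with a short script. This is a minimal sketch in Python, assuming the usual Hadoop-style layout of `<property>` elements with `<name>` and `<value>` children; the helper names `read_property` and `check_aux_jars` are made up for illustration. It reads `hive.aux.jars.path`, splits it on commas, and flags entries that contain whitespace or that do not exist on the local filesystem.

```python
import os
import xml.etree.ElementTree as ET


def read_property(hive_site_path, name):
    """Return the <value> text of the named <property>, or None if absent."""
    root = ET.parse(hive_site_path).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            return prop.findtext("value", default="")
    return None


def check_aux_jars(hive_site_path):
    """Return a list of (entry, problem) tuples for hive.aux.jars.path."""
    value = read_property(hive_site_path, "hive.aux.jars.path")
    if value is None:
        return [("", "hive.aux.jars.path is not set")]
    problems = []
    for entry in value.split(","):
        if any(ch.isspace() for ch in entry):
            problems.append((entry, "contains whitespace"))
            continue
        # Assumption: entries are local paths, optionally with a file:// prefix.
        local = entry[len("file://"):] if entry.startswith("file://") else entry
        if not os.path.isfile(local):
            problems.append((entry, "file not found"))
    return problems
```

An empty result only means the configured jars exist and the list is well-formed; it does not rule out a second copy of es-hadoop elsewhere on the classpath.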

I agree that both the classes are in the same package, so ideally this issue
shouldn't occur. One thing I didn't understand from your suggestion is:
why do I need to add the es-hadoop jar to the Hadoop classpath? I added it
only to the Hive classpath, as per the URL below:

http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/hive.html#_installation_3

Thanks

  • Atul


Hi Costin,

Actually even that issue is resolved :-) There is a spelling difference in the
samples available on the web: all of them have the storage class as
“EsStorageHandler”, however only your GitHub post says it is
“ESStorageHandler”, which is right (https://gist.github.com/costin/8025827)!
The error should have been more accurate if I am using a wrong class name.

Now the next problem: the MapReduce job is failing for some reason. I am
still a beginner in Hadoop, so I am not exactly sure where to debug. Here are
some logs; it looks like there is some bad character “&#” in the job.xml file.
But that is generated by Hive, right?

Hive Log :------------------------------------------------

hive> insert overwrite table ex_address select name, st_no, st_name, city, state, zip from employee.address;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1417158738771_0001, Tracking URL = http://finattr-comp-dev-01:8088/proxy/application_1417158738771_0001/
Kill Command = /apps/hadoop-2.5.1/bin/hadoop job -kill job_1417158738771_0001
Hadoop job information for Stage-0: number of mappers: 0; number of reducers: 0
2014-11-27 23:13:37,547 Stage-0 map = 0%, reduce = 0%
Ended Job = job_1417158738771_0001 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Job 0: HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

******************** Container Job Logs *************************

Stderr:---------------------------------------------

[sas@finattr-comp-dev-01 container_1417158738771_0001_02_000001]$ cat stderr
[Fatal Error] job.xml:606:51: Character reference "&#
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Syslog:---------------------------------------------

[sas@finattr-comp-dev-01 container_1417158738771_0001_02_000001]$ cat syslog
2014-11-27 23:13:36,023 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1417158738771_0001_000002
2014-11-27 23:13:36,334 FATAL [main] org.apache.hadoop.conf.Configuration: error parsing conf job.xml
org.xml.sax.SAXParseException; systemId: file:///tmp/hadoop-sas/nm-local-dir/usercache/sas/appcache/application_1417158738771_0001/container_1417158738771_0001_02_000001/job.xml; lineNumber: 606; columnNumber: 51; Character reference "&#
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2183)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2252)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2205)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2112)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1078)
    at org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.initialize(MRWebAppUtil.java:50)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1407)
2014-11-27 23:13:36,337 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.RuntimeException: org.xml.sax.SAXParseException; systemId: file:///tmp/hadoop-sas/nm-local-dir/usercache/sas/appcache/application_1417158738771_0001/container_1417158738771_0001_02_000001/job.xml; lineNumber: 606; columnNumber: 51; Character reference "&#
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2348)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2205)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2112)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1078)
    at org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.initialize(MRWebAppUtil.java:50)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1407)
Caused by: org.xml.sax.SAXParseException; systemId: file:///tmp/hadoop-sas/nm-local-dir/usercache/sas/appcache/application_1417158738771_0001/container_1417158738771_0001_02_000001/job.xml; lineNumber: 606; columnNumber: 51; Character reference "&#
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2183)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2252)
    ... 5 more
2014-11-27 23:13:36,340 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1

Thanks

  • Atul


ESStorageHandler was the name used in es-hadoop 1.3 Beta1; in 2.0, the name was changed to EsStorageHandler.
If you are using the wrong class name, you would get a ClassNotFoundException, assuming you don't have 1.3 in your
classpath.

Regarding the error: yes, for some reason Hive, since 0.12 or 0.13, is using an illegal XML character for comments, which
creates an invalid XML file. There is an issue in Hive about this but it looks to be ignored [1].
Somebody (or maybe it was you) raised an issue on es-hadoop [2] to try and address the bug in Hive (for our own tests we
try to "fix" the invalid file), since it no longer seems to work. However, it's unclear to what degree this will be
possible since it's an issue in Hive itself, not es-hadoop...

Cheers,

[1] https://issues.apache.org/jira/browse/HIVE-7024
[2] https://github.com/elasticsearch/elasticsearch-hadoop/issues/322
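
To see what the parser is rejecting in a generated job.xml, one can scan the file for numeric character references that XML 1.0 does not allow. The sketch below is an illustration only, not part of Hive or es-hadoop, and the function names are hypothetical. It applies the XML 1.0 `Char` production (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF) and reports where each offending reference starts; the positions may not match the parser's reported column exactly.

```python
import re

# Matches decimal (&#8;) and hexadecimal (&#x8;) character references.
CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def is_valid_xml10_char(cp):
    """True if code point `cp` is allowed by the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def find_illegal_char_refs(text):
    """Return (line, column, reference) for each illegal character reference.

    Line and column are 1-based and point at the '&' of the reference.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in CHAR_REF.finditer(line):
            hex_digits, dec_digits = m.groups()
            cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
            if not is_valid_xml10_char(cp):
                hits.append((lineno, m.start() + 1, m.group(0)))
    return hits
```

Reading the job.xml contents into a string and passing it to `find_illegal_char_refs` lists the references that make the file invalid, which helps identify the property that carries them.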

On 11/28/14 9:53 AM, Atul Paldhikar wrote:

Hi Costin,

Actually even that issue is resolved JThere is spelling difference in the sample available on the web, all of them have
the storage class as “EsStorageHandler” however only your GitHub post says it is “ESStorageHandler” which is right
(https://gist.github.com/costin/8025827) ! The error should have been more accurate if I am using a wrong class name.

Now the next problem, the MapReduce job is failing for some reason. I am still a beginner in Hadoop so not exactly sure
where to debug. Here are some logs, looks like some bad character “&#” in the job.xml file. But I that is generated by
Hive right ?

Hive Log :------------------------------------------------

hive> insert overwrite table ex_address select name, st_no, st_name, city, state, zip from employee.address;

Total jobs = 1

Launching Job 1 out of 1

Number of reduce tasks is set to 0 since there's no reduce operator

Starting Job = job_1417158738771_0001, Tracking URL = http://finattr-comp-dev-01:8088/proxy/application_1417158738771_0001/

Kill Command = /apps/hadoop-2.5.1/bin/hadoop job -kill job_1417158738771_0001

Hadoop job information for Stage-0: number of mappers: 0; number of reducers: 0

2014-11-27 23:13:37,547 Stage-0 map = 0%, reduce = 0%

Ended Job = job_1417158738771_0001 with errors

Error during job, obtaining debugging information...

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

MapReduce Jobs Launched:

Job 0: HDFS Read: 0 HDFS Write: 0 FAIL

Total MapReduce CPU Time Spent: 0 msec

******************** Container Job Logs *************************

Stderr:---------------------------------------------

[sas@finattr-comp-dev-01 container_1417158738771_0001_02_000001]$ cat stderr

[Fatal Error] job.xml:606:51: Character reference "&#

log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).

log4j:WARN Please initialize the log4j system properly.

log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Syslog:---------------------------------------------

[sas@finattr-comp-dev-01 container_1417158738771_0001_02_000001]$ cat syslog

2014-11-27 23:13:36,023 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application
appattempt_1417158738771_0001_000002

2014-11-27 23:13:36,334 FATAL [main] org.apache.hadoop.conf.Configuration: error parsing conf job.xml

org.xml.sax.SAXParseException; systemId:
file:///tmp/hadoop-sas/nm-local-dir/usercache/sas/appcache/application_1417158738771_0001/container_1417158738771_0001_02_000001/job.xml
<file:///\tmp\hadoop-sas\nm-local-dir\usercache\sas\appcache\application_1417158738771_0001\container_1417158738771_0001_02_000001\job.xml>;
lineNumber: 606; columnNumber: 51; Character reference "&#

     at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)

     at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)

     at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)

     at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2183)

     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2252)

     at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2205)

     at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2112)

     at org.apache.hadoop.conf.Configuration.get(Configuration.java:1078)

     at org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.initialize(MRWebAppUtil.java:50)

     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1407)

2014-11-27 23:13:36,337 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster

java.lang.RuntimeException: org.xml.sax.SAXParseException; systemId:
file:///tmp/hadoop-sas/nm-local-dir/usercache/sas/appcache/application_1417158738771_0001/container_1417158738771_0001_02_000001/job.xml
<file:///\tmp\hadoop-sas\nm-local-dir\usercache\sas\appcache\application_1417158738771_0001\container_1417158738771_0001_02_000001\job.xml>;
lineNumber: 606; columnNumber: 51; Character reference "&#

     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2348)

     at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2205)

     at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2112)

     at org.apache.hadoop.conf.Configuration.get(Configuration.java:1078)

     at org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.initialize(MRWebAppUtil.java:50)

     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1407)

Caused by: org.xml.sax.SAXParseException; systemId: file:///tmp/hadoop-sas/nm-local-dir/usercache/sas/appcache/application_1417158738771_0001/container_1417158738771_0001_02_000001/job.xml; lineNumber: 606; columnNumber: 51; Character reference "&#

     at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)

     at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)

     at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)

     at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2183)

     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2252)

     ... 5 more

2014-11-27 23:13:36,340 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1

Thanks

  • Atul

On Thursday, November 27, 2014 8:23:25 PM UTC-8, Atul Paldhikar wrote:

Hi Costin,

thanks for your response. I tried all the cleanup but still no luck :-( Here are the steps I tried:

1. Removed es-hadoop 2.1.x completely from the server

2. Updated hive-site.xml as below, but it didn't work

<property>
   <name>hive.aux.jars.path</name>
   <value>/apps/sas/elasticsearch-hadoop-2.0.2/dist/elasticsearch-hadoop-2.0.2.jar</value>
   <description>A comma separated list (with no spaces) of the jar files</description>
</property>

3. Added the jar file in the hiveconf as below, still the same issue

hive --hiveconf hive.aux.jars.path=/apps/sas/elasticsearch-hadoop-2.0.2/dist/elasticsearch-hadoop-2.0.2.jar

4. Tried adding the jar file in the hive session, still didn't work

add jar /apps/sas/elasticsearch-hadoop-2.0.2/dist/elasticsearch-hadoop-2.0.2.jar;

I agree that both classes are in the same package, so ideally this issue shouldn't occur. One thing I didn't
understand from your suggestion is why I need to add the es-hadoop jar to the Hadoop classpath. I added it only
to the Hive classpath, as per the URL below:

http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/hive.html#_installation_3

Thanks
- Atul



On Thursday, November 27, 2014 3:29:04 PM UTC-8, Costin Leau wrote:

    Hi,

    The issue is most likely caused by two different versions of es-hadoop within your classpath, probably es-hadoop
    2.0.x (2.0.2) and 2.1.x (2.1.0.Beta3). If both are picked up by Hive or Hadoop, the JVM will have two jars with
    classes under the same package name. This leads to weird conflicts, as classes from one jar can interact with
    classes from the other jar, especially since the code went through major internal changes between 2.0.x and 2.1.x.

    Make sure you have only one version of es-hadoop in your classpath, both on the client and in the cluster. That
    includes the Hive classpath, the Hadoop classpath, and the submitting jar (since the library might be embedded).

    P.S. IllegalAccessError indicates an illegal call, such as calling a non-public class from a different package.
    However, in this case both classes are in the same package and the HiveUtils class is not private...
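
One way to check for the conflict described above is to look for every jar that carries a copy of the class named in the error. The following is a minimal Python sketch, not part of es-hadoop or Hive; the function name `jars_containing` is an assumption made for illustration. It only inspects the entries of each jar file it finds, so jars nested inside other jars are not examined.

```python
import zipfile
from pathlib import Path


def jars_containing(class_name, search_dirs):
    """Return the jars under search_dirs that contain the given class.

    class_name is a fully qualified Java class name, for example
    "org.elasticsearch.hadoop.hive.HiveUtils".
    """
    # A class a.b.C is stored inside a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for directory in search_dirs:
        for jar in sorted(Path(directory).rglob("*.jar")):
            try:
                with zipfile.ZipFile(jar) as archive:
                    if entry in archive.namelist():
                        hits.append(jar)
            except zipfile.BadZipFile:
                # Skip files that have a .jar suffix but are not valid archives.
                continue
    return hits
```

For example, calling `jars_containing("org.elasticsearch.hadoop.hive.HiveUtils", ["/apps/sas"])` and getting back more than one path would suggest that several copies of the library are present; the directories to search (Hive lib, Hadoop lib, any auxiliary jar locations) depend on the installation.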

    Cheers,

    --
    Costin

--
Costin

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/547865B9.4010005%40gmail.com.
For more options, visit https://groups.google.com/d/optout.

Finally I was able to load the data from Hive to Elasticsearch!!! Yes, you
are right: in the beginning I started with es-hadoop 1.3.x and then replaced
it with 2.0.2. However, the old jar remained somewhere in the Hive classpath
and caused all this trouble.
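
A leftover jar like this can be hard to spot by eye. Below is a minimal Python sketch that groups es-hadoop jars by the version in their file name; it is an illustration only, and the function name `find_es_hadoop_versions` and the file-name pattern are assumptions based on the jar names quoted in this thread (for example `elasticsearch-hadoop-2.0.2.jar`). Jars that were renamed, or that embed the library, would not be found this way.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming scheme: elasticsearch-hadoop-<version>.jar, optionally with a
# module name before the version (e.g. elasticsearch-hadoop-hive-2.0.2.jar).
# Names with a suffix after the version (such as -sources) do not match.
JAR_PATTERN = re.compile(
    r"^elasticsearch-hadoop-(?:[a-z]+-)?(?P<version>\d[\w.]*)\.jar$"
)


def find_es_hadoop_versions(search_dirs):
    """Map each es-hadoop version found under search_dirs to its jar paths."""
    versions = defaultdict(list)
    for directory in search_dirs:
        for jar in Path(directory).rglob("elasticsearch-hadoop-*.jar"):
            match = JAR_PATTERN.match(jar.name)
            if match:
                versions[match.group("version")].append(jar)
    return dict(versions)
```

If the returned dictionary has more than one key, more than one version is present in the directories searched, and every version except the intended one is a candidate for removal.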

Now I have two "ex_address" tables, one in the "Default" database and the
other in the "Employee" database. I want to drop the one in "Default", but
Hive won't let me without es-hadoop 1.3.x in the path! I think I will just
leave it there.

Thanks for your help.

  • Atul

On Friday, November 28, 2014 4:08:55 AM UTC-8, Costin Leau wrote:

ESStorageHandler was the name used in es-hadoop 1.3 Beta1; in 2.0, the
name was changed to EsStorageHandler.
If you use the wrong class name, you get a ClassNotFoundException,
assuming you don't have 1.3 in your classpath.

Regarding the error: yes, for some reason Hive, since 0.12 or 0.13, uses
an illegal XML character for comments, which creates an invalid XML file.
There is an issue in Hive about this, but it looks to be ignored [1].
Somebody (or maybe it was you) raised an issue on es-hadoop [2] to try to
address the bug in Hive (for our own tests we try to "fix" the invalid
file), since that seems to not be working anymore. However, it's unclear
to what degree this will be possible, since it's an issue in Hive itself,
not es-hadoop...

Cheers,

[1] https://issues.apache.org/jira/browse/HIVE-7024
[2] https://github.com/elasticsearch/elasticsearch-hadoop/issues/322
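
To see where a file such as job.xml contains a character reference that XML 1.0 does not allow, a small scan like the following can help. This is a hedged Python sketch, not the workaround used in the es-hadoop tests; the function names are assumptions, and the legal ranges follow the Char production of the XML 1.0 specification. The exact reference in the job.xml from this thread is truncated in the logs, so the sketch makes no claim about which character Hive writes.

```python
import re

# Numeric character references: decimal (&#NNN;) or hexadecimal (&#xHHH;).
CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def is_legal_xml10_char(code_point):
    """True if the code point is allowed by the XML 1.0 Char production."""
    return (
        code_point in (0x9, 0xA, 0xD)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


def find_illegal_char_refs(text):
    """Return (line, column, reference) tuples, 1-based, for illegal references."""
    problems = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for match in CHAR_REF.finditer(line):
            hex_digits, dec_digits = match.groups()
            code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
            if not is_legal_xml10_char(code_point):
                problems.append((line_no, match.start() + 1, match.group(0)))
    return problems
```

The column reported here is where the reference starts, which may differ from the column a parser reports for the same problem. The function takes the file contents as a string, so the caller decides how to read and decode the file.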

--
Costin
