ElasticSearch and Hadoop Error


(david fauth) #1

I am using MortarData and wanted to try out the new Elasticsearch integration
for Pig (elasticsearch-hadoop 1.3.0.M1). The following is the Pig code:

REGISTER '/Users/davidfauth/Downloads/elasticsearch-hadoop-1.3.0.M1.jar';

nucc_codes = LOAD '/Users/davidfauth/Downloads/nucc_taxonomy_130.txt'
    USING PigStorage('\t') AS (
        nuccCode:chararray,
        nuccType:chararray,
        nuccClassification:chararray,
        nuccSpecialty:chararray);

-- ETL the data to your heart's content
B = FOREACH nucc_codes GENERATE nuccCode,
    TOTUPLE(nuccType, nuccClassification) AS nuccData;

-- save the result to Elasticsearch
STORE B INTO '/Users/davidfauth/MortarDataOut/es'
    USING org.elasticsearch.hadoop.pig.ESStorage();

When I run this, it logs the following error:

ERROR 2013-10-07 21:22:42,398 [main] org.apache.pig.tools.grunt.Grunt:
ERROR 2998: Unhandled internal error.
org.apache.commons.codec.binary.Base64.encodeBase64([BZZ)[B
Pig Stack Trace

ERROR 2998: Unhandled internal error.
org.apache.commons.codec.binary.Base64.encodeBase64([BZZ)[B

java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeBase64([BZZ)[B
    at org.elasticsearch.hadoop.util.IOUtils.serializeToBase64(IOUtils.java:38)
    at org.elasticsearch.hadoop.pig.ESStorage.checkSchema(ESStorage.java:113)
    at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:65)
    at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:77)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
    at org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:45)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:300)
    at org.apache.pig.PigServer.compilePp(PigServer.java:1315)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1252)
    at org.apache.pig.PigServer.execute(PigServer.java:1241)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:358)
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:134)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:195)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:167)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
    at org.apache.pig.Main.run(Main.java:433)
    at org.apache.pig.PigRunner.run(PigRunner.java:49)
    at com.mortardata.hawk.HawkMain.runHawkPig(HawkMain.java:26)
    at com.mortardata.hawk.HawkMain.main(HawkMain.java:85)

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Costin Leau) #2

Hi,

I haven't used MortarData with ES-Hadoop. From the looks of it, the
Hadoop/Pig dependency jars that ES-Hadoop uses internally are not present
on the classpath. What version of Apache Pig does MortarData use? Can you
share your classpath?

cheers,


(Costin Leau) #3

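To clarify: the jar (commons-codec) is present, but it seems to be an old
version. The NoSuchMethodError means the Base64 class was found, but not the
encodeBase64(byte[], boolean, boolean) overload that ES-Hadoop calls.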
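
One way to check for an outdated commons-codec is a small Java program, run with the same classpath as the Pig job, that prints where Base64 is loaded from and whether the three-argument encodeBase64 overload from the stack trace exists. This is only a sketch: the class name CodecCheck and its two helper methods are made up for illustration, and it uses plain reflection so that it compiles without commons-codec being present.

```java
import java.security.CodeSource;

class CodecCheck {

    /** True if the class can be loaded and has a public method with this exact signature. */
    static boolean hasMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            Class.forName(className).getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    /** Location (usually a jar URL) the class was loaded from, or a short note if unknown. */
    static String locate(String className) {
        try {
            CodeSource src = Class.forName(className).getProtectionDomain().getCodeSource();
            return src == null ? "(no code source reported)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        String base64 = "org.apache.commons.codec.binary.Base64";
        // Which jar wins on this classpath?
        System.out.println("Base64 loaded from: " + locate(base64));
        // ([BZZ)[B in the error is the JVM descriptor for (byte[], boolean, boolean) returning byte[].
        System.out.println("encodeBase64(byte[], boolean, boolean) present: "
                + hasMethod(base64, "encodeBase64", byte[].class, boolean.class, boolean.class));
    }
}
```

If the second line prints false, an older commons-codec jar is probably shadowing the one ES-Hadoop expects. As far as I know that overload was added in commons-codec 1.4, but check the javadoc for the version you have.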

