[hadoop] java.lang.NoClassDefFoundError: org/elasticsearch/hadoop/mr/EsOutputFormat

Hello, I'm using elasticsearch-hadoop-2.0.2.jar and have run into the following problem:

Exception in thread "main" java.lang.NoClassDefFoundError: org/elasticsearch/hadoop/mr/EsOutputFormat
    at com.clqb.app.ElasticSearch.run(ElasticSearch.java:46)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at com.clqb.app.ElasticSearch.main(ElasticSearch.java:60)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.ClassNotFoundException: org.elasticsearch.hadoop.mr.EsOutputFormat
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 9 more

Here’s my program:

public class ElasticSearch extends Configured implements Tool {

    public static class AwesomeMapper extends Mapper<LongWritable, Text, NullWritable, MapWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            context.write(NullWritable.get(), XmlUtils.xmlTextToMapWritable(value)); // XmlUtils is not shown here
        }
    }

    public static class AwesomeReducer extends Reducer<NullWritable, MapWritable, NullWritable, NullWritable> {
    }

    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        conf.set("xmlinput.start", "<page>");
        conf.set("xmlinput.end", "</page>");
        conf.setBoolean("mapred.map.tasks.speculative.execution", false);
        conf.setBoolean("mapred.reduce.tasks.speculative.execution", false);
        conf.set("es.nodes", "localhost:9200");
        conf.set("es.resource", "radio/artists");

        Job job = Job.getInstance(conf);
        job.setJarByClass(ElasticSearch.class);
        job.setInputFormatClass(XmlInputFormat.class);
        job.setOutputFormatClass(EsOutputFormat.class);
        job.setMapOutputValueClass(MapWritable.class);
        job.setMapperClass(AwesomeMapper.class);
        job.setReducerClass(AwesomeReducer.class);

        Path outputPath = new Path(args[1]);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, outputPath);
        outputPath.getFileSystem(conf).delete(outputPath, true);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new ElasticSearch(), args);
        System.exit(exitCode);
    }
}

P.S. I have also made sure that elasticsearch-hadoop-2.0.2.jar is included in my -libjars. Any suggestions?

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/762794c8-0bd0-4c16-b1dd-9c914a29a710%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Hi,

It looks like es-hadoop is not part of your classpath (hence the NoClassDefFoundError). This might be due either to some
misconfiguration of your classpath or to the way the Configuration object is used. It looks like you are using it
correctly, though I typically use Job(Configuration) instead of getInstance() (static factory methods are always bad).
You can also try other variants, such as the LIBJARS or HADOOP_CLASSPATH environment variables; pay attention to which
separator you use between your jars (, vs : vs ;).
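
To make the separator point concrete, here is a minimal sketch (not taken from the thread). It assumes the usual conventions: the -libjars option takes a comma-separated list, while HADOOP_CLASSPATH is an ordinary JVM classpath whose entries are joined with the platform path separator (: on Unix, ; on Windows). The class name JarListSeparators and the second jar path are hypothetical.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

public class JarListSeparators {

    // -libjars expects its jars as one comma-separated argument.
    static String toLibJars(List<String> jars) {
        return String.join(",", jars);
    }

    // HADOOP_CLASSPATH is a regular classpath string, so it uses the
    // platform path separator (":" on Unix, ";" on Windows).
    static String toClasspath(List<String> jars) {
        return String.join(File.pathSeparator, jars);
    }

    public static void main(String[] args) {
        List<String> jars = Arrays.asList(
                "/path/to/my/elasticsearch-hadoop-2.0.2.jar",
                "/path/to/another-dependency.jar"); // hypothetical second jar

        System.out.println("-libjars " + toLibJars(jars));
        System.out.println("HADOOP_CLASSPATH=" + toClasspath(jars));
    }
}
```

Mixing the two up (for example, passing a colon-separated list to -libjars) typically makes the whole list look like a single, non-existent jar.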

Try to debug the classpath and see what you get: look at the jar that is created and uploaded to HDFS, turn on logging
on the Hadoop side, and potentially use the distributed cache.
Embedding the libraries under lib/ also works (see [2]).
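
One way to debug the classpath on the client side is a small probe like the sketch below. It uses only the JDK; the class name ClasspathProbe is hypothetical, and the idea is to run it with the same classpath as the job driver (or to copy the locate method into the driver). It prints java.class.path and reports which jar, if any, a given class would be loaded from.

```java
import java.security.CodeSource;

public class ClasspathProbe {

    /**
     * Returns the location the class is loaded from, a placeholder string
     * when the JVM reports no code source (e.g. core JDK classes), or null
     * when the class cannot be loaded at all.
     */
    static String locate(String className) {
        try {
            // initialize=false: only resolve the class, do not run static initializers
            Class<?> c = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "(no code source reported)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "org.elasticsearch.hadoop.mr.EsOutputFormat";
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        String where = locate(name);
        System.out.println(name + " -> " + (where == null ? "NOT FOUND" : where));
    }
}
```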

All of them have pros and cons; the idea is to get your sample running and then debug your environment to see what the issue is.

Cheers,

[1] MapReduce Tutorial
[2] How-to: Include Third-Party Libraries in Your MapReduce Job - Cloudera Blog

(In reply to CAI Longqi's message of 12/14/14, 3:32 PM.)

--
Costin


Thanks, I managed to fix this issue with export HADOOP_CLASSPATH=/path/to/my/elasticsearch-hadoop-2.0.2.jar. It works,
but I don't know why: I had already passed the jar with -libjars, so I don't understand why Hadoop needs it specified
again through that environment variable.

Another question: how do I debug the classpath and check whether the jar is created and uploaded? (I'm using
hadoop-2.5.1 with YARN, not 0.X.X, so I don't have a job tracker or
${mapred.local.dir}/taskTracker/${user.name}/jobcache/$jobid/jars.)

BTW, I've already gone through the very basic word count example. Thanks anyway.
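
On why HADOOP_CLASSPATH was needed in addition to -libjars, one plausible explanation (an assumption, not confirmed anywhere in this thread) follows from the stack trace: the error is thrown in thread "main" under RunJar, that is, in the client JVM that submits the job rather than in a map or reduce task. Jars passed with -libjars are shipped to the tasks, but a class literal such as EsOutputFormat.class in the driver is resolved by the class loader that loaded the driver class, and that loader only sees the client classpath, which HADOOP_CLASSPATH extends. The sketch below uses only the JDK (the class name LoaderVisibility is hypothetical) to check whether a class is visible to the defining loader and to the thread context loader. Calling visibleTo from inside run() would show which of the two can see the es-hadoop classes.

```java
public class LoaderVisibility {

    // True if the given loader can resolve the class (without initializing it).
    static boolean visibleTo(ClassLoader loader, String className) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "org.elasticsearch.hadoop.mr.EsOutputFormat";

        // Loader that defined this class: the one used for class literals in its code.
        ClassLoader defining = LoaderVisibility.class.getClassLoader();
        // Loader attached to the current thread: frameworks may replace it at runtime.
        ClassLoader context = Thread.currentThread().getContextClassLoader();

        System.out.println("defining loader sees " + name + ": " + visibleTo(defining, name));
        System.out.println("context loader sees " + name + ": " + visibleTo(context, name));
    }
}
```

If the context loader can see the class but the defining loader cannot, the jar is reaching the client only through a runtime-added loader, and putting it on HADOOP_CLASSPATH (or bundling it under lib/ in the job jar) is the likely fix.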

(In reply to Costin Leau's message of Monday, December 15, 2014, 4:09:00 AM UTC+8.)