Does the Elasticsearch Hadoop WRITE operation not use our custom reducer?
I tried the following code and observed that our custom reducer is not
invoked.
job.setOutputFormatClass(EsOutputFormat.class);
job.setMapOutputKeyClass(NullWritable.class);
job.setMapOutputValueClass(BytesWritable.class);
job.setMapperClass(MyMapper.class);
job.setReducerClass(MyReducer.class);
es-hadoop does not provide either mappers or reducers; the Map/Reduce integration relies on the Input/OutputFormat, which can be invoked either from a Mapper or from a Reducer.
Your Reducer might not be invoked for a variety of reasons; typically the map and reduce phases have different output types and the job fails silently after invoking the context.write method. In fact, you can just remove EsOutputFormat and see whether it makes any difference (it shouldn't).
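To illustrate the point about mismatched output types: a Reducer is only fed records whose key/value types match the declared map-output types, and the final output types must be declared separately. Below is a minimal driver sketch that sets both; MyMapper, MyReducer, and the es.resource value are assumptions carried over from the question, not a definitive configuration.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.elasticsearch.hadoop.mr.EsOutputFormat;

public class EsWriteDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("es.resource", "myindex/mytype"); // target index/type - an assumption
        Job job = Job.getInstance(conf);
        job.setJarByClass(EsWriteDriver.class);

        job.setMapperClass(MyMapper.class);   // hypothetical classes from the question
        job.setReducerClass(MyReducer.class);

        // Intermediate (map-output) types: must match MyReducer's declared input types
        job.setMapOutputKeyClass(NullWritable.class);
        job.setMapOutputValueClass(BytesWritable.class);

        // Final (reduce-output) types: must match what MyReducer actually emits
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(BytesWritable.class);

        job.setOutputFormatClass(EsOutputFormat.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

If the reduce-output types differ from the map-output types, omitting the setOutputKeyClass/setOutputValueClass calls is a common way to end up with the failure described above.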
On 11/1/14 7:20 AM, Sarath wrote:
Thanks for the response. You are right, the Map/Reduce integration relies on
the Input/OutputFormat. Even after removing EsOutputFormat my custom
reducer is not invoked, so it must be some issue with the Hadoop configuration.
Even though I don't set the number of reduce tasks, Hadoop takes care of
starting reducers if needed; in my case 1 reducer is running. The issue here is that the
custom reducer defined as part of the job configuration is not invoked by Hadoop.
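One way to narrow this down is to inspect the job counters after completion: if a reduce task ran but consumed zero records while the mappers emitted output, the problem is between the map output and the reducer's declared input types rather than in the reducer itself. A sketch, assuming the driver still holds the Job handle after waitForCompletion:

```java
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.TaskCounter;

public class JobDiagnostics {
    // Call after job.waitForCompletion(true) in the driver.
    static void printReduceCounters(Job job) throws Exception {
        Counters counters = job.getCounters();
        long mapOut    = counters.findCounter(TaskCounter.MAP_OUTPUT_RECORDS).getValue();
        long reduceIn  = counters.findCounter(TaskCounter.REDUCE_INPUT_RECORDS).getValue();
        long reduceOut = counters.findCounter(TaskCounter.REDUCE_OUTPUT_RECORDS).getValue();
        System.out.println("map output records:    " + mapOut);
        System.out.println("reduce input records:  " + reduceIn);
        System.out.println("reduce output records: " + reduceOut);
        // mapOut > 0 but reduceIn == 0 suggests the map-output key/value types
        // don't line up with the Reducer's declared input types.
    }
}
```

This requires a completed job on a running cluster, so it is a diagnostic aid rather than something runnable in isolation.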