Elasticsearch Hadoop WRITE operation not using Reducer

Hi All,

Does the Elasticsearch Hadoop write operation not use our custom reducer?
I tried the following code and observed that our custom reducer is not
invoked.
job.setOutputFormatClass(EsOutputFormat.class);
job.setMapOutputKeyClass(NullWritable.class);
job.setMapOutputValueClass(BytesWritable.class);
job.setMapperClass(MyMapper.class);
job.setReducerClass(MyReducer.class);

configuration.set("es.nodes", "master:9200");
configuration.set("es.resource.write", "{indexName}/{indexType}");
configuration.set("es.input.json", "yes");
//configuration.set("es.write.operation", "upsert");

Thanks,
Sarath

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6607380d-48f1-4ba3-9b06-06de0ab0841c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Hi,

es-hadoop itself uses neither mappers nor reducers; the Map/Reduce integration
relies on the Input/OutputFormat, which can be used from either a Mapper or a
Reducer.

Your Reducer might not be invoked for a variety of reasons; typically the map
and reduce phases have different output types and the job fails silently after
invoking the context.write method. In fact, you can just remove the
EsOutputFormat and see whether it makes any difference (it shouldn't).
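
One possible cause consistent with this symptom (a reduce phase runs, yet the
custom logic never executes) is a reduce() method whose signature does not
match the one in org.apache.hadoop.mapreduce.Reducer, for example taking an
Iterator instead of an Iterable. Such a method is an overload, not an override,
so the inherited default reduce(), which passes values through unchanged, runs
instead. Whether this applies to MyReducer is an assumption, since its source
is not shown here. The sketch below reproduces the effect in plain Java without
Hadoop on the classpath; BaseReducer, BrokenReducer, FixedReducer and
ReducerOverrideDemo are hypothetical stand-ins, not Hadoop classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for Hadoop's Reducer base class: the default
// reduce() emits every value unchanged (identity behaviour).
class BaseReducer {
    protected void reduce(String key, Iterable<String> values, List<String> out) {
        for (String v : values) {
            out.add(key + "=" + v);
        }
    }

    // Stand-in for the framework: it only ever calls the Iterable-based reduce().
    public final List<String> run(String key, List<String> values) {
        List<String> out = new ArrayList<>();
        reduce(key, values, out);
        return out;
    }
}

// Broken: the parameter is Iterator, not Iterable, so this method is an
// overload that the framework never calls. Adding @Override here would
// make the compiler reject it.
class BrokenReducer extends BaseReducer {
    protected void reduce(String key, Iterator<String> values, List<String> out) {
        out.add(key + "=custom");
    }
}

// Fixed: same signature as the base method, checked by @Override.
class FixedReducer extends BaseReducer {
    @Override
    protected void reduce(String key, Iterable<String> values, List<String> out) {
        out.add(key + "=custom");
    }
}

class ReducerOverrideDemo {
    static List<String> runBroken() {
        return new BrokenReducer().run("k", Arrays.asList("a", "b"));
    }

    static List<String> runFixed() {
        return new FixedReducer().run("k", Arrays.asList("a", "b"));
    }

    public static void main(String[] args) {
        System.out.println(runBroken()); // identity output: custom logic skipped
        System.out.println(runFixed());  // custom logic runs
    }
}
```

Annotating reduce() with @Override in the real reducer turns this kind of
mismatch into a compile-time error instead of a silent fallback.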

--
Costin

Hi Costin,

Thanks for the response. You are right: the Map/Reduce integration relies on
the Input/OutputFormat. Even after removing EsOutputFormat, my custom
reducer is not invoked, so there must be some issue with my Hadoop
configuration.

Thanks,
Sarath

Hi,

It doesn't look like you've set the number of reduce tasks in your job config,
for example:

job.setNumReduceTasks(10);

Hi Telax,

Even though I don't set the number of reduce tasks, Hadoop takes care of
starting reducers if needed. In my case, 1 reducer is running. The issue here
is that the custom reducer defined as part of the Job configuration is not
invoked by Hadoop.

Thanks,
Sarath
