Does the Elasticsearch Hadoop WRITE operation not use our custom reducer?
I tried the following code and observed that our custom reducer is not
invoked.
job.setOutputFormatClass(EsOutputFormat.class);
job.setMapOutputKeyClass(NullWritable.class);
job.setMapOutputValueClass(BytesWritable.class);
job.setMapperClass(MyMapper.class);
job.setReducerClass(MyReducer.class);
es-hadoop does not provide either mappers or reducers; the Map/Reduce integration relies on the Input/OutputFormat, which can be invoked either from a Mapper or from a Reducer.
Your Reducer might not be invoked for a variety of reasons; typically the map and reduce phases have different output types and the job fails silently after invoking the context.write method. In fact, you can just remove EsOutputFormat and see whether it makes any difference (it shouldn't).
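To illustrate the point about mismatched output types: a Reducer is only fed records whose key/value types match the declared map-output types, and the final output types must be declared separately. Below is a minimal driver sketch that sets both; MyMapper, MyReducer, and the es.resource value are assumptions carried over from the question, not a definitive configuration.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.elasticsearch.hadoop.mr.EsOutputFormat;

public class EsWriteDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("es.resource", "myindex/mytype"); // target index/type - an assumption
        Job job = Job.getInstance(conf);
        job.setJarByClass(EsWriteDriver.class);

        job.setMapperClass(MyMapper.class);   // hypothetical classes from the question
        job.setReducerClass(MyReducer.class);

        // Intermediate (map-output) types: must match MyReducer's declared input types
        job.setMapOutputKeyClass(NullWritable.class);
        job.setMapOutputValueClass(BytesWritable.class);

        // Final (reduce-output) types: must match what MyReducer actually emits
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(BytesWritable.class);

        job.setOutputFormatClass(EsOutputFormat.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

If the reduce-output types differ from the map-output types, omitting the setOutputKeyClass/setOutputValueClass calls is a common way to end up with the failure described above.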
On 11/1/14 7:20 AM, Sarath wrote:
Thanks for the response. You are right, the Map/Reduce integration relies on
the Input/OutputFormat. Even after removing EsOutputFormat my custom
reducer is not invoked, so it must be some issue with the Hadoop configuration.
Even though I don't set the number of reduce tasks, Hadoop takes care of
starting reducers if needed; in my case 1 reducer is running. The issue here is that the
custom reducer defined as part of the job configuration is not invoked by Hadoop.
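One way to narrow this down is to inspect the job counters after completion: if a reduce task ran but consumed zero records while the mappers emitted output, the problem is between the map output and the reducer's declared input types rather than in the reducer itself. A sketch, assuming the driver still holds the Job handle after waitForCompletion:

```java
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.TaskCounter;

public class JobDiagnostics {
    // Call after job.waitForCompletion(true) in the driver.
    static void printReduceCounters(Job job) throws Exception {
        Counters counters = job.getCounters();
        long mapOut    = counters.findCounter(TaskCounter.MAP_OUTPUT_RECORDS).getValue();
        long reduceIn  = counters.findCounter(TaskCounter.REDUCE_INPUT_RECORDS).getValue();
        long reduceOut = counters.findCounter(TaskCounter.REDUCE_OUTPUT_RECORDS).getValue();
        System.out.println("map output records:    " + mapOut);
        System.out.println("reduce input records:  " + reduceIn);
        System.out.println("reduce output records: " + reduceOut);
        // mapOut > 0 but reduceIn == 0 suggests the map-output key/value types
        // don't line up with the Reducer's declared input types.
    }
}
```

This requires a completed job on a running cluster, so it is a diagnostic aid rather than something runnable in isolation.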