I'm trying to index some data using elasticsearch-mapreduce 2.3.1. My input consists of Avro files that I convert to JSON documents, and I'm getting this error:
Error: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: Found unrecoverable error [130.226.238.240:9200] returned Bad Request(400) - failed to parse;mapper [values] of different type, current_type [long], merged_type [string]; Bailing out..
at org.elasticsearch.hadoop.rest.RestClient.retryFailedEntries(RestClient.java:207)
at org.elasticsearch.hadoop.rest.RestClient.bulk(RestClient.java:170)
at org.elasticsearch.hadoop.rest.RestRepository.tryFlush(RestRepository.java:225)
at org.elasticsearch.hadoop.rest.RestRepository.flush(RestRepository.java:248)
at org.elasticsearch.hadoop.rest.RestRepository.close(RestRepository.java:267)
at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.doClose(EsOutputFormat.java:214)
at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.close(EsOutputFormat.java:196)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:550)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:629)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
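My reading of the error (I may be wrong) is that the field `values` is a long in some of my JSON documents and a string in others, so Elasticsearch maps the field from the first document it sees and then rejects the conflicting one. A minimal illustration of the two document shapes I suspect are colliding (the field values here are made up):

```java
public class ValuesTypeConflict {
    public static void main(String[] args) {
        // Two documents destined for the same index. Elasticsearch maps
        // "values" from the first document it indexes (here: long); the
        // second document then fails the bulk request with a 400,
        // because string cannot be merged into a long mapping.
        String doc1 = "{\"values\": 123}";     // field mapped as long
        String doc2 = "{\"values\": \"abc\"}"; // conflict: string vs long
        System.out.println(doc1);
        System.out.println(doc2);
    }
}
```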
The important pieces of my code are:
... implements Tool {
    Configuration conf = getConf();
    conf.setBoolean(MRJobConfig.MAP_SPECULATIVE, false);
    conf.set("es.nodes", args[1]);
    conf.set("es.resource", args[2]);
    conf.set("es.input.json", "yes");
    conf.set(AvroJob.INPUT_SCHEMA, Occurrence.getClassSchema().toString());
    conf.setBoolean(MRJobConfig.MAPREDUCE_JOB_USER_CLASSPATH_FIRST, true);
    conf.setBoolean(MRJobConfig.MAPREDUCE_TASK_CLASSPATH_PRECEDENCE, true);

    Job job = Job.getInstance(conf, "occurrence-es-indexing");
    job.setUserClassesTakesPrecedence(true);
    job.setInputFormatClass(AvroKeyInputFormat.class);
    job.setMapOutputKeyClass(NullWritable.class);
    job.setMapOutputValueClass(BytesWritable.class);
    job.setMapperClass(OccurrenceAvroMapper.class);
    job.setOutputFormatClass(EsOutputFormat.class);
    ...
}
... extends Mapper .. {
    @Override
    public void map(AvroKey occurrenceAvro, NullWritable value, Context context)
            throws IOException, InterruptedException {
        context.write(NullWritable.get(),
                new BytesWritable(GSON.toJson(occurrenceAvro.datum()).getBytes()));
    }
}
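One workaround I'm considering (just a sketch, not tested against my actual job) is to coerce the conflicting field to a string in the mapper before the JSON is handed to EsOutputFormat, so every document maps `values` the same way. The helper below is hypothetical and operates on a plain Map rather than my real Avro datum:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CoerceValuesField {
    // Hypothetical helper: force the "values" field to a string so that
    // all documents sent to Elasticsearch agree on the field's type.
    static Map<String, Object> coerce(Map<String, Object> doc) {
        Object v = doc.get("values");
        if (v != null && !(v instanceof String)) {
            doc.put("values", String.valueOf(v));
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("values", 123L); // a long, as in my failing documents
        System.out.println(coerce(doc).get("values").getClass().getSimpleName());
    }
}
```

I'd then serialize the coerced map with Gson instead of the raw datum, but I'm not sure whether losing the numeric type is acceptable for my use case.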