Hi,
I looked at your code sample again and your configuration is incorrect. For some reason you are using
FileInputFormat/FileOutputFormat to set both the input and the output; since you are writing through
es-hadoop (EsOutputFormat), you need to specify only the input and not the output. Moreover, in your case the mapper
ignores the input, so potentially you can remove that as well.
Did you set es.resource for es-hadoop? I don't see it set anywhere, though since no exception was raised you
probably configured it somewhere.
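For illustration, here is a minimal sketch of the settings a write-only es-hadoop job would typically carry. Only es.resource comes from the message above; es.nodes, the host and index/type values, and the EsJobSettings helper itself are assumptions made for the example, not the poster's actual configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects the es-hadoop settings for a job that only writes to Elasticsearch.
class EsJobSettings {

    // The node address and index/type passed in are placeholders chosen by the caller.
    static Map<String, String> build(String nodes, String resource) {
        if (resource == null || resource.trim().isEmpty()) {
            throw new IllegalArgumentException("es.resource must be set (for example index/type)");
        }
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("es.nodes", nodes);        // where Elasticsearch is reachable
        settings.put("es.resource", resource);  // target index/type for the writes
        return settings;
    }

    public static void main(String[] args) {
        Map<String, String> settings = build("localhost:9200", "myindex/mytype");
        // In the driver, each entry would be copied into the Hadoop Configuration
        // (conf.set(key, value)) before the Job is created. The call to
        // FileOutputFormat.setOutputPath(...) would be dropped, because
        // EsOutputFormat writes to Elasticsearch rather than to a file path.
        settings.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Failing fast on a missing es.resource is a design choice of this sketch; it simply surfaces the omission in the driver instead of later, inside a running task.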
I've tried replicating the problem but I can't - the Writables are properly converted into JSON. Can you please enable
logging [1] and report back? Additionally, make sure
you are using the latest build, since its error message is different and should give you more information (which field is
being extracted and from where).
Cheers,
[1] Elasticsearch Platform — Find real-time answers at scale | Elastic
On 7/15/14 7:19 PM, Aurélien V wrote:
Hi Costin,
Thanks for the support. Well, I'm still experiencing this issue, and for now I see no obvious reason for it. My only guess
is that it is environment-related, so I'm cleaning up Maven dependencies and environment variables and testing version
compatibility. So far nothing has worked.
As for the constant, it was a test to ensure my data wasn't corrupted in some way. So I'm pretty sure the exception
gives no clue about the real issue.
I'll keep you posted if I discover the reason; it may interest someone after all.
Aurelien
2014-07-14 18:48 GMT+03:00 Costin Leau <costin.leau@gmail.com>:
Hi,
Nothing jumps out from your configuration. The error indicates that the values passed to es-hadoop cannot be
processed for some reason, which is all the more surprising considering your Mapper writes only constants to the output.
I've pushed some improvements to the 2.x branch which better explain the conditions in which the error appears - you can
either build the jar yourself [1] and test it out or wait for the nightly build to publish the artifact [2].
Cheers,
[1] https://github.com/elasticsearch/elasticsearch-hadoop/tree/2.x
[2] http://build.elasticsearch.com/view/Hadoop/job/es-hadoop-nightly-2x/
On 7/14/14 1:19 PM, Aurélien wrote:
Hi,
I can't sort this out! I'm using Hadoop CDH3u6 and trying to get ES to index my data. I tried with raw JSON and
MapWritable, and I always get the same kind of error:
|
java.lang.Exception: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: [org.elasticsearch.hadoop.serialization.field.MapWritableFieldExtractor@35b5f7bd] cannot extract value from object [org.apache.hadoop.io.MapWritable@11c757a1]
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:349)
Caused by: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: [org.elasticsearch.hadoop.serialization.field.MapWritableFieldExtractor@35b5f7bd] cannot extract value from object [org.apache.hadoop.io.MapWritable@11c757a1]
    at org.elasticsearch.hadoop.serialization.bulk.TemplatedBulk$FieldWriter.write(TemplatedBulk.java:49)
    at org.elasticsearch.hadoop.serialization.bulk.TemplatedBulk.writeTemplate(TemplatedBulk.java:101)
    at org.elasticsearch.hadoop.serialization.bulk.TemplatedBulk.write(TemplatedBulk.java:77)
    at org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:130)
    at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.write(EsOutputFormat.java:161)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:531)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at my.jobs.index.IndexMapper.map(IndexMapper.java:27)
    at my.jobs.index.IndexMapper.map(IndexMapper.java:19)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:648)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:218)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
|
It seems to me that everything is right. Here is the configuration of the indexing job:
|
Job job = new Job(getConf(), "Indexing into Elastic search.");
job.setJarByClass(getClass());
DomainRankDriver.loadLibrariesToDistributedCache(job);
Path input = new Path(args[0]);
FileInputFormat.addInputPath(job, input);
FileOutputFormat.setOutputPath(job, new Path(args[1]));
// Used by ES-hadoop to take Text as Json
job.setOutputFormatClass(EsOutputFormat.class);
// job.setMapOutputValueClass(Text.class);
job.setMapOutputValueClass(MapWritable.class);
job.setMapperClass(IndexMapper.class);
job.setNumReduceTasks(0);
|
And my simple mapper:
|
@Override
public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    MapWritable map = new MapWritable();
    map.put(new Text("test"), new Text("value"));
    context.write(new LongWritable(), map);
}
|
Any clue where to look next? I'm stuck.
Thanks,
Aurelien
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/7f6545ab-d6d9-4fdf-8923-0b60e0ea5297%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
Costin
--
Costin