Updating/upserting using Cascading/Scalding with a pre-defined _id field

Hi. I've been creating Scalding Taps to read from and write to Elasticsearch
(using Cascading EsTaps underneath), and everything has been going fine when
writing documents with index operations
(es.write.operation=ES_OPERATION_INDEX). I am writing to a preexisting
index that I created through a REST call whose mapping defines the _id path:

curl -XPUT 'http://host:port/test_index/?pretty' -d '
{
  "mappings": {
    "test_document": {
      "_id": {
        "path": "some_id_field"
      }
    }
  }
}
'

This allows my EsTap to automatically use "some_id_field" (from the
Cascading source) as the _id when a document gets written to Elasticsearch.
Unfortunately, when I set the es.write.operation property to
ES_OPERATION_UPDATE or ES_OPERATION_UPSERT, it no longer automatically uses
the some_id_field that I specified when I created the index. If I do not
specify the field explicitly with the es.mapping.id property
(ES_MAPPING_ID), I get either a

Operation [%s] requires an id but none was given/found

or

Operation [%s] requires an id but none (%s) was specified

error, depending on whether I'm trying to update or upsert.
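
For reference, a minimal sketch of the explicit-id workaround described above, written in plain Java so it runs without es-hadoop on the classpath. The keys es.resource, es.write.operation and es.mapping.id are es-hadoop settings; the class and method names (EsUpsertSettings, upsertSettings) are made up for illustration, and how the Properties reach the tap or flow is left out because it depends on the job setup.

```java
import java.util.Properties;

public class EsUpsertSettings {

    // Builds the settings for an upsert write that names the id field explicitly.
    // The helper itself is hypothetical; only the property keys come from es-hadoop.
    public static Properties upsertSettings(String resource, String idField) {
        Properties props = new Properties();
        props.setProperty("es.resource", resource);          // index/type to write to
        props.setProperty("es.write.operation", "upsert");   // or "update"
        props.setProperty("es.mapping.id", idField);         // source field used as _id
        return props;
    }

    public static void main(String[] args) {
        Properties props = upsertSettings("test_index/test_document", "some_id_field");
        // Print the settings that would be handed to the job configuration.
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

This is the part I would like to avoid, since it means repeating the id field name that the index mapping already declares.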

My question is: is it possible to avoid specifying an es.mapping.id
property when updating/upserting (which would be less painful, since we
aren't manually specifying fields when we load the data, anyway), or can I
somehow retrieve the "path" value for _id from the mapping and pass it along
to the Cascading tap for any update/upsert operation?
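
To make the second option concrete, here is a rough sketch of pulling the _id "path" out of a mapping document so it could be set as es.mapping.id. It assumes the mapping JSON has already been fetched as a string (the sample below simply repeats the mapping body from this post), and it uses a regular expression as a stand-in for a real JSON parser. The names IdPathLookup and idPath are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IdPathLookup {

    // Matches: "_id" : { "path" : "<field>"   (whitespace between tokens is optional)
    private static final Pattern ID_PATH =
        Pattern.compile("\"_id\"\\s*:\\s*\\{\\s*\"path\"\\s*:\\s*\"([^\"]+)\"");

    // Returns the configured _id path, or empty if the mapping does not define one.
    public static Optional<String> idPath(String mappingJson) {
        Matcher m = ID_PATH.matcher(mappingJson);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Sample input: the mapping body used when creating test_index.
        String mapping =
            "{\"mappings\":{\"test_document\":{\"_id\":{\"path\":\"some_id_field\"}}}}";
        System.out.println(idPath(mapping).orElse("(no _id path configured)"));
        // prints: some_id_field
    }
}
```

The returned value would then be used as the es.mapping.id setting for update/upsert jobs, but that still means an extra lookup per index, which is why I'm hoping there is a built-in way.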

Thanks,

Andy
