Hi,
I am trying to bulk-update an Elasticsearch index from a MapReduce job,
using TransportClient.
Everything works fine, and all documents get indexed, when I use
Elasticsearch's internal versioning.
However, I want to propagate the version from an external source (in my
case HBase; the MapReduce job indexes HBase columns). When I use the
external version, indexing becomes sporadic: not all documents get indexed,
and each run of the MapReduce job reports a different number of indexed
documents.
The code is below. The client is a TransportClient, and I get the version
from HBase; it is held in the idx[] array (idx[0] is the document ID,
idx[1] is the version).
If I comment out setVersion and setVersionType (i.e. use Elasticsearch's
internal versioning), everything works fine.
The code below runs inside my reduce task. The map task simply pulls all
the data from HBase and passes it to the reducer.
    client.prepareIndex("index", "product", idx[0])
        .setVersion(Long.parseLong(idx[1]))
        .setVersionType(VersionType.EXTERNAL)
        .setOperationThreaded(false)
        .setSource(builder.string())
        .execute().actionGet();
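As far as I understand, with VersionType.EXTERNAL a write is accepted only
when the supplied version is strictly greater than the version already
stored for that ID. The sketch below is a simplified stand-alone model of
that rule, not Elasticsearch code; the class ExternalVersionModel and its
methods are made-up names for illustration only. It shows that the number
of accepted writes for one ID depends on the order in which the versions
arrive.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone model of the external-version check.
// Assumption: a write is accepted only if its version is strictly
// greater than the version currently stored for the same ID.
class ExternalVersionModel {
    // Last accepted version per document ID.
    private final Map<String, Long> stored = new HashMap<>();

    // Returns true if the write is accepted under the modelled rule.
    boolean index(String id, long version) {
        Long current = stored.get(id);
        if (current != null && version <= current) {
            return false; // modelled as a version conflict: write rejected
        }
        stored.put(id, version);
        return true;
    }

    // Replays the given versions for one ID and counts accepted writes.
    static int countAccepted(String id, long[] versions) {
        ExternalVersionModel model = new ExternalVersionModel();
        int accepted = 0;
        for (long v : versions) {
            if (model.index(id, v)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println(countAccepted("row1", new long[] {1, 2, 3})); // 3
        System.out.println(countAccepted("row1", new long[] {3, 1, 2})); // 1
        System.out.println(countAccepted("row1", new long[] {2, 2}));    // 1
    }
}
```

If the same ID reaches Elasticsearch more than once (for example from
retried task attempts), the arrival order is not fixed, which might explain
the varying counts. I have not confirmed that this is what happens in my
job.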
Please let me know if there is any issue with this approach.
Regards,
Dibyendu