I maintain a system where I generate the document IDs before performing bulk index, update and delete operations using these IDs. The main difference from your example is that I use "id" and "index", without a leading underscore, rather than "_id" and "_index".
Here's how I build up my bulk operation (in the Perl programming language):
    for my $doc (@doc_array) {
        my $payload = {    # build a new payload
            id    => $doc->{id},
            type  => $fixed_type,    # TODO: remove in ES7
            index => $indexname,
        };
        if ( $doc->{action} ne 'delete' ) {    # add document as 'source' or 'doc' depending on the action
            if ( $doc->{action} eq 'update' ) {    # just add the partial 'doc'
                $payload->{doc}           = $doc->{partial};
                $payload->{doc_as_upsert} = 'false';
                $payload->{detect_noop}   = 'true';
            } else {    # for new documents add the full document in 'source'
                $payload->{source}   = $doc->{full};
                $payload->{pipeline} = $pipeline if $pipeline;    # only supported for indexing new docs
            }
        }
        # a 'delete' action carries only the metadata (id, type, index)
        $bulk->add_action( $doc->{action} => $payload );
    }
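To check which keys end up in each payload without a running cluster, the same branching can be pulled into a small function and exercised on sample data. This is only a sketch: `build_bulk_action`, the sample documents and the index name `my-index` are made up for illustration, and the `type` key is left out on the assumption that you are on ES7 or later.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Search::Elasticsearch): returns the
# ( action => payload ) pair that would be handed to $bulk->add_action().
sub build_bulk_action {
    my ( $doc, $indexname, $pipeline ) = @_;

    # Metadata keys without a leading underscore, as in the loop above.
    my %payload = (
        id    => $doc->{id},
        index => $indexname,
    );

    if ( $doc->{action} eq 'update' ) {
        # Partial update: only the changed fields go into 'doc'.
        $payload{doc}           = $doc->{partial};
        $payload{doc_as_upsert} = 'false';
        $payload{detect_noop}   = 'true';
    }
    elsif ( $doc->{action} ne 'delete' ) {
        # New document: the full body goes into 'source'.
        $payload{source}   = $doc->{full};
        $payload{pipeline} = $pipeline if $pipeline;
    }

    # A delete falls through with metadata only.
    return ( $doc->{action} => \%payload );
}

# Made-up sample queue covering all three action types.
my @doc_array = (
    { id => 'a1', action => 'index',  full    => { title => 'first' } },
    { id => 'a1', action => 'update', partial => { title => 'renamed' } },
    { id => 'a1', action => 'delete' },
);

for my $doc (@doc_array) {
    my ( $action, $payload ) = build_bulk_action( $doc, 'my-index', undef );
    printf "%s: %s\n", $action, join( ',', sort keys %$payload );
}
```

In the real loop you would pass the returned pair straight to `$bulk->add_action(...)` instead of printing it.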