I have a document property in an Elasticsearch index called 'processing'.
It's a boolean value that records whether or not an external program is
running that is working with that document. I am running many external
processors, but documents cannot be processed by multiple programs at once.
Am I able to select a document that is not being processed and update the
'processing' property to TRUE without another processor sneaking in and
choosing the document? Is there an UPDATE WHERE type operation I can
perform that will return the _id of the document that was updated?
On Mon, 2012-11-26 at 21:49 -0800, Brian Jones wrote:
> I have a document property in an Elasticsearch index called
> 'processing.' It's a boolean value that records whether or not an
> external program is running that is working with that document. I am
> running many external processors, but documents cannot be processed by
> multiple programs at once.
> Am I able to select a document that is not being processed and update
> the 'processing' property to TRUE without another processor sneaking
> in and choosing the document? Is there an UPDATE WHERE type operation
> I can perform that will return the _id of the document that was
> updated?
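Yes, you can: use the update API, pass the current version number, and
leave retry_on_conflict set to zero. If another processor has changed
the document since you read it, the update fails with a version conflict
instead of overwriting the other processor's change.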
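To make the version check concrete, here is a minimal sketch of the
claim logic in Python. It does not talk to a real cluster: VersionedStore
is a hypothetical in-memory stand-in, and try_claim and claim_first are
illustrative names that are not part of any Elasticsearch client. Against
a real index, the versioned update request described above would play the
role of store.update().

```python
import threading


class VersionConflict(Exception):
    """Raised when the caller's expected version is stale."""


class VersionedStore:
    """Toy in-memory stand-in for an index with per-document versions."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source dict)
        self._lock = threading.Lock()

    def index(self, doc_id, source):
        # Create or overwrite a document, bumping its version.
        with self._lock:
            version = self._docs.get(doc_id, (0, None))[0] + 1
            self._docs[doc_id] = (version, dict(source))
            return version

    def get(self, doc_id):
        # Return the current version together with a copy of the source.
        with self._lock:
            version, source = self._docs[doc_id]
            return version, dict(source)

    def update(self, doc_id, changes, expected_version):
        # Compare-and-set: apply the change only if nobody wrote in between.
        with self._lock:
            version, source = self._docs[doc_id]
            if version != expected_version:
                raise VersionConflict(doc_id)
            source.update(changes)
            self._docs[doc_id] = (version + 1, source)
            return version + 1


def try_claim(store, doc_id):
    """Return True if this caller flipped 'processing' to True."""
    version, source = store.get(doc_id)
    if source.get("processing"):
        return False
    try:
        store.update(doc_id, {"processing": True}, expected_version=version)
    except VersionConflict:
        return False  # no retry: another processor got there first
    return True


def claim_first(store, candidate_ids):
    """Return the id of the first document this caller claimed, or None."""
    for doc_id in candidate_ids:
        if try_claim(store, doc_id):
            return doc_id
    return None
```

The point of not retrying is that a conflict already tells you the
document belongs to someone else, so the processor should simply move on
to the next candidate.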
Alternatively, you could just use the create and delete APIs. If you try
to create a document with a particular ID and it fails, then something
else has already created it. Once the job is done, delete the document.
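The create-and-delete approach can be sketched the same way. Again this
is only an in-memory illustration: LockIndex and process_exclusively are
hypothetical names, and create() stands in for a create request that
fails when the ID already exists.

```python
import threading


class AlreadyExists(Exception):
    """Raised when a document with the requested ID already exists."""


class LockIndex:
    """Toy stand-in for an index used only to hold lock documents."""

    def __init__(self):
        self._ids = set()
        self._lock = threading.Lock()

    def create(self, doc_id):
        # Fails if the ID is taken, which is what makes it usable as a lock.
        with self._lock:
            if doc_id in self._ids:
                raise AlreadyExists(doc_id)
            self._ids.add(doc_id)

    def delete(self, doc_id):
        with self._lock:
            self._ids.discard(doc_id)


def process_exclusively(locks, doc_id, work):
    """Run work(doc_id) only if this caller created the lock document."""
    try:
        locks.create(doc_id)
    except AlreadyExists:
        return False  # something else is already processing this document
    try:
        work(doc_id)
    finally:
        locks.delete(doc_id)  # release the lock once the job is done
    return True
```

One thing to keep in mind with this variant is that a processor that
crashes before the delete leaves the lock document behind, so some form
of cleanup is needed.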