Can I perform the equivalent of a MySQL transaction in Elasticsearch?

I have a document property in an Elasticsearch index called 'processing'.
It's a boolean that records whether an external program is currently
working with that document. I am running many external processors, but a
document must not be processed by more than one program at a time.

Am I able to select a document that is not being processed and update the
'processing' property to TRUE without another processor sneaking in and
choosing the same document? Is there an UPDATE WHERE type operation I can
perform that will return the _id of the document that was updated? Can I
perform transactions with Elasticsearch?

Is there an alternative or best-practice way to accomplish this with
Elasticsearch?

--

On Tuesday, 27 November 2012 06:53:44 UTC+1, Brian Jones wrote:

Am I able to select a document that is not being processed and update the
'processing' property to TRUE without another processor sneaking in and
choosing the document? Is there an UPDATE WHERE type operation I can
perform that will return the _id of the document that was updated? Can I
perform transactions with Elasticsearch?

Elasticsearch's versioning includes CAS-like (compare-and-swap) functionality,
which should help with this: see "Versioning" [link] and [link].
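
To illustrate what a CAS-like version check gives you, here is a minimal
in-memory sketch in Python. It is not the Elasticsearch API: VersionedStore,
VersionConflict and claim are hypothetical names made up for this example,
and the store only imitates the idea that a write carrying a stale version
is rejected.

```python
class VersionConflict(Exception):
    """Raised when the caller's expected version is stale."""


class VersionedStore:
    """Toy stand-in for an index: every write bumps the document version."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source dict)

    def get(self, doc_id):
        version, source = self._docs[doc_id]
        return version, dict(source)  # copy so callers cannot mutate in place

    def put(self, doc_id, source, expected_version=None):
        current = self._docs.get(doc_id, (0, None))[0]
        # The compare step: refuse the write if someone else wrote since our read.
        if expected_version is not None and expected_version != current:
            raise VersionConflict("expected %d, found %d" % (expected_version, current))
        self._docs[doc_id] = (current + 1, dict(source))
        return current + 1


def claim(store, doc_id):
    """Try to set processing=True; return True only if this caller won."""
    version, source = store.get(doc_id)
    if source.get("processing"):
        return False  # already claimed
    source["processing"] = True
    try:
        store.put(doc_id, source, expected_version=version)
    except VersionConflict:
        return False  # another processor wrote first
    return True
```

If two processors read the same version, only the first write with that
version succeeds; the second gets a conflict and should pick another document.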

Klaus

--

I've tried implementing this but it doesn't seem like the Elasticsearch API
supports versioning on updates. The _update below should have failed
because the document was at version 5 when the update was performed with a
version of 3 in the POST data.

REQUEST: curl -XPOST 'host:9200/companies/company/43/_update?version=3&pretty=true' -d '{
"script" : "ctx._source.processing = true" }'

RESPONSE: {"ok":true,"_index":"companies","_type":"company","_id":"43","_version":6}

Is versioning available with updates or only with PUTs when replacing the
entire document? Perhaps I'm not writing my scripts correctly for this
kind of update. The scripting documentation is very confusing for me.

Perhaps there is a simpler way to update a single property of a document
that I'm missing.

On Monday, November 26, 2012 10:27:35 PM UTC-8, Klaus Brunner wrote:

Elasticsearch's versioning includes CAS-like (compare-and-swap) functionality,
which should help with this: see [link] and [link].

Klaus

--

Some further research leads me to believe that an _update fetches the full
document, changes the specified property, and then replaces the old document
with the new one. That means _update may save on sending large documents
over the network, but it doesn't save the cluster a request to get the
document. So if the client already has the document, I should just update
the data client side and then replace the old document with a PUT request.
Am I correct here?

--

On Tue, 2012-11-27 at 19:15 -0800, Brian Jones wrote:

I've tried implementing this but it doesn't seem like the
Elasticsearch API supports versioning on updates. The _update below
should have failed because the document was at version 5 when the
update was performed with a version of 3 in the POST data.

That's interesting. It looks like update doesn't support versioning,
which kinda makes sense given the concepts of "updates are idempotent" and
"retry updates retry_on_conflict times".

So yes, it looks like you either retrieve the document, update it, and
reindex it with a version, or use the create/delete process that I
described in another mail.
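
The retrieve / update / reindex-with-a-version route could look roughly like
the Python sketch below. It assumes the document URL layout from the curl
example earlier in the thread, and that a PUT to the index API with a stale
?version= is answered with HTTP 409; newer Elasticsearch releases may use
different concurrency parameters, so check the docs for your version.
http_json and try_claim are hypothetical helper names, not part of any
client library.

```python
import json
import urllib.error
import urllib.request


def http_json(method, url, body=None):
    """Send a JSON request and return (status_code, parsed_body)."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # Error responses such as 409 still carry a body; parse it if possible.
        raw = err.read().decode("utf-8")
        try:
            return err.code, json.loads(raw)
        except ValueError:
            return err.code, {"raw": raw}


def try_claim(doc_url, http=http_json):
    """Return True if this caller flipped 'processing' to true, else False."""
    status, found = http("GET", doc_url)
    if status != 200:
        return False
    source = found["_source"]
    if source.get("processing"):
        return False  # another processor already holds the document
    source["processing"] = True
    # Assumption: the index API rejects a stale ?version= with HTTP 409.
    put_url = "%s?version=%d" % (doc_url, found["_version"])
    status, _ = http("PUT", put_url, source)
    return status in (200, 201)
```

A processor would call try_claim("http://host:9200/companies/company/43")
and only start working when it returns True; on False it moves on to another
candidate document. The http parameter exists so the logic can be exercised
without a running cluster.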

clint

--