Bulk API - Update with timestamp based external versioning


(Rob Bygrave) #1

Hi,

Q: Is there any way to set the next _version value using a Bulk 'update' request with a 'doc' body, or is it always just incremented by 1?

Longer description of the issue:

In general there are 2 external versioning strategies that I am looking at using with the Bulk API.
A) Incrementing long value (works as expected - all good)
B) Timestamp based ("Last Update") long value

I'm using the Bulk API with 'update' actions and a partial 'doc', and specifying the _version value on each action.

When the actual external strategy is A) Incrementing long value, everything works as expected. I believe this is because Elasticsearch internally increments the _version value by 1, and this behavior is consistent with the external strategy.

When the actual external strategy is B) Timestamp based long value, the first 'update' request works as expected but subsequent 'update' requests fail with VersionConflictEngineException. The reason is that the _version value has been incremented by 1, whereas I really wanted to set _version to the new 'last update timestamp' as part of the request. I have tried a few options to supply the next value for _version (including it inside the 'doc', and including it at the same level as the 'doc'), but the _version is always just incremented by 1.

My Bulk Update requests look like:

... previous indexed doc 1 etc.

A) Incrementing long value

_version is incremented to 8 when successful. All good, as the Elasticsearch behavior mirrors the external versioning strategy (the external system thinks the next version is 8 as well).

{"update":{"_id":"1","_type":"foo","_index":"foo","_version":7}}
 {"doc":{"name":"modified name"}}

B) Using Timestamps ("Last Update") long value
_version is incremented by 1 to 1432993069195 when successful.

{"update":{"_id":"1","_type":"foo","_index":"foo","_version":1432993069194}}
 {"doc":{"name":"modified name"}}

However, the external system thinks the _version should now be some later timestamp value (say 1432993069293) rather than the previous value plus 1. Ideally I could specify the next _version value as a timestamp supplied in the update request, instead of having it incremented by 1.
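To make the mismatch concrete, here is a toy model in Python of the behavior described above. It is only a sketch of what I observe, not Elasticsearch code: ToyDoc and VersionConflict are made-up names, and the model simply assumes that an update checks the supplied version and then bumps the stored version by 1.

```python
class VersionConflict(Exception):
    """Stands in for VersionConflictEngineException in this toy model."""


class ToyDoc:
    """Hypothetical document whose version is bumped by 1 on every update."""

    def __init__(self, version, source):
        self.version = version
        self.source = dict(source)

    def update(self, expected_version, partial_doc):
        # Reject the update when the caller's version does not match the stored one.
        if expected_version != self.version:
            raise VersionConflict(
                f"current version [{self.version}] != provided [{expected_version}]"
            )
        self.source.update(partial_doc)
        # Increment by 1, regardless of what the external system's clock says.
        self.version += 1
        return self.version


doc = ToyDoc(version=1432993069194, source={"name": "original name"})

# First update: the supplied version matches, so it succeeds.
after_first = doc.update(1432993069194, {"name": "modified name"})

# The external system has meanwhile recorded its own "last update" timestamp.
external_version = 1432993069293

try:
    doc.update(external_version, {"name": "modified again"})
    second_update_conflicted = False
except VersionConflict:
    second_update_conflicted = True
```

With strategy A the external counter and the stored version stay in step, so the check keeps passing. With strategy B they diverge after the first update, which matches the conflict I am seeing.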

Specifying the next _version value at the same level as 'doc' does not work; _version is still incremented by 1.

{"update":{"_id":"1","_type":"foo","_index":"foo","_version":1432993069194}}
 {"doc":{"name":"modified name"},"_version":1432993069293}

Specifying the next _version value inside the document does not work either; _version is still incremented by 1.

{"update":{"_id":"1","_type":"foo","_index":"foo","_version":1432993069194}}
 {"doc":{"name":"modified name","_version":1432993069293}}
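One possible direction, offered as an untested sketch rather than a confirmed fix: the Elasticsearch documentation notes that the update API does not support external versioning and points to the index API instead. Under that assumption, the Bulk request could use an 'index' action with an external version type, at the cost of sending the whole document rather than a partial 'doc'. The Python below only builds the request body. The helper name bulk_index_line_pair is hypothetical, and the "_version_type" key is an assumption that does not appear in my requests above (its exact spelling may depend on the Elasticsearch version).

```python
import json


def bulk_index_line_pair(index, doc_type, doc_id, version, source):
    """Build the two NDJSON lines for one Bulk 'index' action (hypothetical helper)."""
    action = {
        "index": {
            "_index": index,
            "_type": doc_type,
            "_id": doc_id,
            # Assumption: with an external version type, the supplied value is
            # stored as the new version if it is greater than the current one.
            "_version": version,
            "_version_type": "external",
        }
    }
    compact = (",", ":")
    action_line = json.dumps(action, separators=compact)
    # The full document is sent, because an 'index' action replaces the stored source.
    source_line = json.dumps(source, separators=compact)
    return action_line + "\n" + source_line + "\n"


body = bulk_index_line_pair("foo", "foo", "1", 1432993069293, {"name": "modified name"})
print(body, end="")
```

The trade-off is that this replaces the stored document instead of merging a partial 'doc', so it only fits when the external system can supply the complete document on every change.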

Hopefully that makes sense.

Thanks, Rob.

