We are battling to understand why version conflicts occur and why retry_on_conflict is a sensible strategy for resolving them.
Consider document _id: 1, which has foo: 1 and _version: 1. If several processes try to update it:
Then I expect that the first process writes foo: 2, _version: 2 and the next process writes foo: 3, _version: 3. The order in which Elasticsearch receives the requests should be the only thing that matters. Since we do not provide a _version parameter from our ES client, we are effectively saying: just update, and don't worry about overwriting values in the wrong order. We expect ES to handle its optimistic locking internally and not throw version errors.
If we assume that ES has multiple processes which handle these updates in the "wrong" order, i.e. not in the time order in which the App Processes generated them, then we expect ES simply to write them in that wrong order, and foo: 2 will be the final (and incorrect) result.
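The "last write wins" behaviour we expect without version checks can be sketched with a minimal in-memory simulation (this is an illustration of the expected semantics, not of ES internals; all names are invented):

```python
# Hypothetical sketch: with no version precondition, every update is
# applied in arrival order and _version is simply incremented.
doc = {"foo": 1, "_version": 1}

def update_no_version(doc, new_foo):
    # No expected-version check: always apply, always bump _version.
    doc["foo"] = new_foo
    doc["_version"] += 1

# Requests arrive in the "wrong" order: foo: 3 before foo: 2.
update_no_version(doc, 3)
update_no_version(doc, 2)
print(doc)  # {'foo': 2, '_version': 3} -- the stale value wins silently
```

Under these semantics no conflict can ever be reported; the cost is that a stale value may end up as the final state, which is what we say we are willing to accept.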
We do not understand how ES gets mixed up. If ES reads _version: 1 shortly before it writes foo: 3, then it will write foo: 3, _version: 2 and afterwards foo: 2, _version: 3. This is fine for our purposes.
How can ES decide that foo: 2 is in conflict with foo: 3? I can only imagine that ES is caching the _version field BEFORE the ES queue is processed, so it gets:

foo: 2, _version: 1 => OK
foo: 3, _version: 1 => FAIL

The failure occurs because ES "knows" it should be updating _version: 1 to _version: 2, but it then sees that _version is already 2.
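That conflict scenario, where both updates capture the same _version before either write lands, can be sketched as a compare-and-swap (again a toy in-memory model, not ES internals):

```python
# Sketch of optimistic concurrency control: each updater remembers the
# _version it read, and a write is rejected if the stored _version has
# moved on in the meantime.
class VersionConflict(Exception):
    pass

store = {"foo": 1, "_version": 1}

def write_if_version(store, new_foo, expected_version):
    if store["_version"] != expected_version:
        raise VersionConflict(
            f"expected {expected_version}, found {store['_version']}"
        )
    store["foo"] = new_foo
    store["_version"] += 1

# Both updates read _version: 1 before either write happens.
v_a = store["_version"]  # 1
v_b = store["_version"]  # 1

write_if_version(store, 2, v_a)  # OK: _version becomes 2
try:
    write_if_version(store, 3, v_b)  # FAIL: stored _version is now 2
except VersionConflict as exc:
    conflict = str(exc)
```

The second write fails not because foo: 3 is "wrong", but because the version it was based on is stale.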
Is there any way, other than a retry, to just ignore versioning?
How does a retry solve any issue related to ordering? IMHO, ES should either just retry until it succeeds (if that's what the developer/client wants and they don't care about ordering) or simply fail if the developer wants ordering. If the developer wants ordering, they should update by providing a _version explicitly.
I can't see why retrying some golden number of times is a good strategy.
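For what it's worth, my understanding (hedged, since this is not spelled out above) is that retry_on_conflict re-reads the fresh document before each retried attempt, so a read-modify-write operation that is order-independent (an increment, for example) converges even after conflicts. A rough sketch of that loop, with all names invented:

```python
# Sketch of what a conflict-retry loop effectively does: on a version
# conflict, re-read the current document and _version, recompute the
# update from fresh state, and try again up to a bounded number of times.
class VersionConflict(Exception):
    pass

store = {"foo": 1, "_version": 1}

def write_if_version(store, new_foo, expected_version):
    if store["_version"] != expected_version:
        raise VersionConflict()
    store["foo"] = new_foo
    store["_version"] += 1

def update_with_retry(store, compute_new_foo, retries=3):
    for _ in range(retries + 1):
        expected = store["_version"]               # re-read every attempt
        new_foo = compute_new_foo(store["foo"])    # recompute from fresh state
        try:
            write_if_version(store, new_foo, expected)
            return True
        except VersionConflict:
            continue                               # stale read: go around again
    return False

ok = update_with_retry(store, lambda foo: foo + 1)
```

This also shows why a bounded retry count only makes sense for commutative updates: retrying an absolute write like foo: 2 would just re-impose a possibly stale value, which is exactly the ordering complaint above.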