We are struggling to understand why version conflicts occur and why retry_on_conflict is a sensible strategy for resolving them.
Consider a document with _id: 1 which has the value foo: 1 and _version: 1. If several processes try to update it:
- AppProcessX: foo: 2
- AppProcessY: foo: 3
Then I expect that the first process writes foo: 2, _version: 2 and the next process writes foo: 3, _version: 3. The order in which Elasticsearch receives the requests should matter. Since we do not provide a _version parameter from our ES client, we are saying: just update, and don't worry about overwriting values in the wrong order. We expect ES to do optimistic locking and not throw version errors.
If we assume that ES has multiple processes which handle these updates in the "wrong" order, i.e. not in the order in which the app processes generated them, we expect ES to just write them in that wrong order, so that foo: 2 will be the final and incorrect result:
- ESProcess1: foo: 3
- ESProcess2: foo: 2
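To make that expectation concrete, here is a minimal sketch of unconditional writes. It is a toy in-memory model, not Elasticsearch code, and blind_write is a hypothetical helper name. Each write overwrites the value and bumps the version without checking anything, so the last request to arrive wins.

```python
# Toy in-memory model of unconditional ("blind") writes; not Elasticsearch code.
doc = {"foo": 1, "_version": 1}


def blind_write(doc, foo):
    """Overwrite the value without checking the version; the version still increments."""
    doc["foo"] = foo
    doc["_version"] += 1
    return dict(doc)  # snapshot of the document after this write


# The requests arrive in the "wrong" order: AppProcessY first, then AppProcessX.
first = blind_write(doc, 3)
second = blind_write(doc, 2)

print(first)   # state after the first write
print(second)  # final state: foo is 2, no conflict was raised
```

In this model no conflict can ever occur, which is the behaviour the question expects when no _version is supplied.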
We do not understand how ES gets mixed up. If ES reads _version: 1 shortly before it writes foo: 3, then it will write foo: 3, _version: 2 and then foo: 2, _version: 3. This is fine for our purposes.
How can ES decide that foo: 2 is in conflict with foo: 3? I can only imagine that ES is caching the _version field BEFORE the ES queue is processed, so it gets:
- ESProcess1: foo: 2, _version: 1 => OK, _version: 2
- ESProcess1: foo: 3, _version: 1 => FAIL, version_conflict_engine_exception
The failure occurs because ES "knows" it should be updating _version: 1 to _version: 2, but it then sees that _version is already 2.
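The scenario above can be sketched as a read followed by a conditional write. As far as I understand the Update API documentation, an update fetches the current document, applies the change, and re-indexes it, using the version to detect whether anything changed in between. The code below is only a toy in-memory model of that idea: ToyStore, VersionConflict and write_if_version are hypothetical names, not Elasticsearch APIs.

```python
class VersionConflict(Exception):
    """Stand-in for version_conflict_engine_exception in this toy model."""


class ToyStore:
    """Single document with a value and a version counter (assumed model)."""

    def __init__(self, foo):
        self.foo = foo
        self.version = 1

    def read(self):
        return self.foo, self.version

    def write_if_version(self, foo, expected_version):
        # The write is accepted only if nobody changed the document since it was read.
        if self.version != expected_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {self.version}"
            )
        self.foo = foo
        self.version += 1
        return self.version


store = ToyStore(foo=1)

# Both updates read the document before either of them has written.
_, seen_by_x = store.read()
_, seen_by_y = store.read()

new_version = store.write_if_version(2, seen_by_x)  # accepted, version becomes 2

try:
    store.write_if_version(3, seen_by_y)  # still expects version 1
    conflict = False
except VersionConflict as exc:
    conflict = True
    print("conflict:", exc)
```

Under this assumption the conflict has nothing to do with the values foo: 2 and foo: 3 themselves; the second write fails only because the version it read is stale.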
Is there any way, other than a retry, to just ignore versioning?
How does a retry solve any issue related to ordering? IMHO ES should just retry until it succeeds (if that's what the developer/client wants and they don't care about ordering), or just fail if the developer wants ordering. If the developer wants ordering, they should update by providing a _version field.
I can't see why retrying some golden number of times is a good strategy.
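For reference, this is roughly what a bounded retry amounts to, again as a toy in-memory sketch rather than Elasticsearch's actual implementation. ToyStore, update_with_retry and the before_write hook are hypothetical names; the hook exists only to simulate a competing writer between the read and the write.

```python
class VersionConflict(Exception):
    """Stand-in for version_conflict_engine_exception in this toy model."""


class ToyStore:
    """Single document with a value and a version counter (assumed model)."""

    def __init__(self, foo):
        self.foo = foo
        self.version = 1

    def read(self):
        return self.foo, self.version

    def write_if_version(self, foo, expected_version):
        if self.version != expected_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {self.version}"
            )
        self.foo = foo
        self.version += 1
        return self.version


def update_with_retry(store, new_foo, retries, before_write=None):
    """Read, then write conditionally; on conflict re-read and try again,
    allowing up to `retries` extra attempts before giving up."""
    for attempt in range(retries + 1):
        _, seen = store.read()
        if before_write is not None:
            before_write(attempt)  # test hook: lets a competing writer interleave
        try:
            return store.write_if_version(new_foo, seen)
        except VersionConflict:
            if attempt == retries:
                raise  # retries exhausted, surface the conflict to the caller


store = ToyStore(foo=1)


def competing_writer(attempt):
    # A competing update sneaks in during the first attempt only.
    if attempt == 0:
        _, version = store.read()
        store.write_if_version(2, version)


final_version = update_with_retry(store, 3, retries=1, before_write=competing_writer)
print(store.foo, final_version)
```

In this sketch the retry does not restore any ordering. It only repeats the read so that the write is based on a fresh version, and the retry count bounds how much work one request may do under contention.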