Version priority

Hey folks,

I have a queue feeding a bunch of workers that are feeding into ES
with bulk index requests. In many cases, we create a document (which
goes into the queue) and then immediately update it.

Before we had multiple workers, this was not a problem, but now we have a
race condition where the updated version of the document is sometimes
overwritten by the earlier version of the document.

For example, if the first version of the doc is {"a":0} and the
second is {"a":1}, sometimes the second version is indexed first and the
first version then lands on top of it, which means the doc stored in ES
is out of date.

Is there any way to set priority on fields?

I'm looking into rearchitecting my workers, as well.

Actually, is it possible to edit an old version? The docs aren't too
clear. I think I get a 409?

Could I change my workers to make sure that {"a":1} is always written
with version=1? That way I make sure 1>0 in terms of "priority".
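
To make that idea concrete, here is a rough sketch of a bulk body where each write carries its own version number. The helper name, the index name "docs" and the id "42" are made up for illustration. The "version" and "version_type" metadata keys are what I'd expect from the current bulk API, so treat them as an assumption and check the docs for your release. I start the numbering at 1 rather than 0 because I'm not sure every release accepts 0 as an external version.

```python
import json


def bulk_index_body(index, doc_id, versioned_docs):
    """Build an NDJSON bulk body where every doc carries its own version.

    versioned_docs is a list of (version, source) pairs. The version would
    come from the producer (e.g. a per-document sequence number put on the
    queue), not from the worker that happens to pick the job up.
    """
    lines = []
    for version, source in versioned_docs:
        # Action metadata line: key names are an assumption, see above.
        action = {
            "index": {
                "_index": index,
                "_id": doc_id,
                "version": version,
                "version_type": "external",
            }
        }
        lines.append(json.dumps(action))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # The bulk format expects a trailing newline.
    return "\n".join(lines) + "\n"


body = bulk_index_body("docs", "42", [(1, {"a": 0}), (2, {"a": 1})])
```

The point is only that the ordering information travels with the document, so it no longer matters which worker gets there first.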

On Apr 18, 12:16 pm, Oren Mazor oren.ma...@gmail.com wrote:

Hey folks,

I have a queue feeding a bunch of workers that are feeding into ES
with bulk index requests. In many cases, we create a document (which
goes into the queue) and then immediately update it.

Before the multiple workers, this was not a problem. but now we have a
race condition where the updated version of the document is actually
overwritten by the earlier version of the document.

For example, if the first version of the doc is: {"a":0} and the
second is {"a":1}, sometimes the second version is inserted first,
which means that the latest version of the doc is out of date.

is there any way to set priority on fields?

I'm looking into rearchitecting my workers, as well.

Do you use VersionType.EXTERNAL at all? Since you have a field in your own
data that is the version, you could set the version property to be your
domain-specific version instead of relying on ES's built-in versioning. Then
when the 1st version tries to overwrite the 2nd, you'll get a rejection
(a version conflict, VersionConflictEngineException in the Java API), which
you can just swallow. The exception handling makes it a bit messy, though,
since you only want to swallow this one type, not other exceptions.
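
A minimal sketch of that behaviour, using an in-memory stand-in rather than a real cluster. FakeIndex, VersionConflict and write_ignoring_stale are hypothetical names for illustration only. The rule it models (accept a write only if its version is strictly greater than the stored one) is my understanding of plain external versioning.

```python
class VersionConflict(Exception):
    """Stand-in for the version conflict error ES reports."""


class FakeIndex:
    """Toy model of external versioning: higher version wins, else reject."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def index(self, doc_id, source, version):
        current = self._docs.get(doc_id)
        if current is not None and version <= current[0]:
            raise VersionConflict(
                f"{doc_id}: stored version {current[0]} >= provided {version}"
            )
        self._docs[doc_id] = (version, source)

    def get(self, doc_id):
        return self._docs[doc_id][1]


def write_ignoring_stale(index, doc_id, source, version):
    """Swallow only the version conflict; any other error still propagates."""
    try:
        index.index(doc_id, source, version)
        return True
    except VersionConflict:
        return False


es = FakeIndex()
# The workers race and the newer write (version 2) arrives first.
results = [
    write_ignoring_stale(es, "42", {"a": 1}, 2),
    write_ignoring_stale(es, "42", {"a": 0}, 1),
]
```

Here the late write of {"a":0} is rejected and {"a":1} stays in place. With bulk requests the conflict shows up as a per-item error with status 409 in the response rather than as an exception, so the "swallow only this one" check becomes a check on each item's status.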

On 19 April 2012 02:23, Oren Mazor oren.mazor@gmail.com wrote:

actually, is it possible to edit an old version? the docs aren't too
clear. I think I get a 409?

I could change my workers to make sure that {"a":1} was always written
to version=1? that way I make sure 1>0 in terms of "priority".


You can use versioning to solve this, yes. Or, if it's a case where you know
you always create a doc in one step and update it in the second, you can set
the create flag on the first index operation, which will fail if the
document already exists.
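
A small sketch of how the create flag plays out in both arrival orders, again with an in-memory stand-in instead of a real cluster. FakeIndex, AlreadyExists and apply_ops are hypothetical names, and the model assumes exactly one create followed by one update per document.

```python
class AlreadyExists(Exception):
    """Stand-in for the error ES reports when a create hits an existing doc."""


class FakeIndex:
    """Toy model: create fails if the id exists, index always overwrites."""

    def __init__(self):
        self._docs = {}

    def create(self, doc_id, source):
        if doc_id in self._docs:
            raise AlreadyExists(doc_id)
        self._docs[doc_id] = source

    def index(self, doc_id, source):
        self._docs[doc_id] = source

    def get(self, doc_id):
        return self._docs[doc_id]


def apply_ops(ops):
    """Apply (kind, source) operations in arrival order, return the final doc."""
    es = FakeIndex()
    for kind, source in ops:
        try:
            if kind == "create":
                es.create("42", source)
            else:
                es.index("42", source)
        except AlreadyExists:
            # The update got there first, so the stale create is dropped.
            pass
    return es.get("42")


in_order = apply_ops([("create", {"a": 0}), ("update", {"a": 1})])
out_of_order = apply_ops([("update", {"a": 1}), ("create", {"a": 0})])
```

Either way the document ends up as {"a":1}. This does not help if there can be several updates racing each other; for that case versioning is the safer route.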

