ElasticSearch document versioning

Hello,

Is it possible to replace a document while keeping the same version? The
situation I have in mind is this:

  • a new doc is inserted with an external version
  • later, that document turns out to be corrupted in some way
  • we want to replace it but keep the same version, because the version is
    external

I know that I can delete it and insert it again with the given version, but
that is not an atomic operation. Is there any other way that guarantees an
atomic update in this situation?

Thanks in advance,
Krzysztof Janosz

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/3243d650-4a2c-4880-8dc1-e8d131536558%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Unfortunately, you can't do this if you're using external versioning. I
would just bump the version in the external source and then use that to
replace the indexed doc.

Also keep in mind that if you delete a doc, you cannot "immediately" index
a new doc with the same or a lower version to replace it. You'd have to wait
60 seconds to be able to "replace" it. Just FYI.
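
The suggestion above (bump the version in the external source, then reindex) can be illustrated with a toy in-memory model. This is only a sketch: `ExternalVersionStore` and its methods are hypothetical names, not part of Elasticsearch, and the rule it encodes (a write is accepted only when the supplied version is strictly greater than the stored one) is an assumption based on the behaviour described in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of external versioning; hypothetical class, not an Elasticsearch API.
final class ExternalVersionStore {
    private static final class Entry {
        final long version;
        final String body;

        Entry(long version, String body) {
            this.version = version;
            this.body = body;
        }
    }

    private final Map<String, Entry> docs = new HashMap<>();

    // Accepts the write only if the supplied version is strictly greater
    // than the stored one; returns false on a version conflict.
    boolean index(String id, long version, String body) {
        Entry current = docs.get(id);
        if (current != null && current.version >= version) {
            return false;
        }
        docs.put(id, new Entry(version, body));
        return true;
    }

    // Returns the stored body, or null if the id is unknown.
    String get(String id) {
        Entry e = docs.get(id);
        return e == null ? null : e.body;
    }

    public static void main(String[] args) {
        ExternalVersionStore store = new ExternalVersionStore();
        store.index("doc-1", 5, "corrupted payload");
        // Re-sending version 5 is rejected as a conflict.
        System.out.println(store.index("doc-1", 5, "fixed payload"));
        // Bumping the external version to 6 replaces the document in one write.
        System.out.println(store.index("doc-1", 6, "fixed payload"));
        System.out.println(store.get("doc-1"));
    }
}
```

In this model the replacement is a single write, so there is no window in which the document is missing, which is the property the delete-and-reinsert approach lacks.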


Thank you for your response.

I've looked into the Elasticsearch source code, and it seems that changing
VersionType.EXTERNAL (or creating a new, similar version type) from

    public boolean isVersionConflict(long currentVersion, long expectedVersion) {
        return currentVersion != Versions.NOT_SET
                && currentVersion != Versions.NOT_FOUND
                && (expectedVersion == Versions.MATCH_ANY
                    || currentVersion >= expectedVersion);
    }

to

    public boolean isVersionConflict(long currentVersion, long expectedVersion) {
        return currentVersion != Versions.NOT_SET
                && currentVersion != Versions.NOT_FOUND
                && (expectedVersion == Versions.MATCH_ANY
                    || currentVersion > expectedVersion);
    }

allows setting an external version equal to the previous one (or at least it
does in a simple test case ;)).
Is it really that simple, or does this carry some additional danger that I
can't see at the moment?
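
To make the effect of the proposed change concrete, here is a minimal, self-contained comparison of the two checks. The class name `ExternalVersionCheck`, the method names, and the sentinel values are placeholders chosen for illustration (the real `Versions` constants may differ); the only difference between the two methods is the `>=` versus `>` comparison discussed in the post above.

```java
// Side-by-side sketch of the stock and patched checks; not Elasticsearch source.
final class ExternalVersionCheck {
    // Placeholder sentinel values, assumed for this sketch only.
    static final long MATCH_ANY = -3L;
    static final long NOT_SET = -2L;
    static final long NOT_FOUND = -1L;

    // Stock check: an incoming version equal to the current one conflicts.
    static boolean stockConflict(long current, long expected) {
        return current != NOT_SET && current != NOT_FOUND
                && (expected == MATCH_ANY || current >= expected);
    }

    // Patched check: only a strictly older incoming version conflicts.
    static boolean patchedConflict(long current, long expected) {
        return current != NOT_SET && current != NOT_FOUND
                && (expected == MATCH_ANY || current > expected);
    }

    public static void main(String[] args) {
        System.out.println(stockConflict(5, 5));   // true: same version rejected
        System.out.println(patchedConflict(5, 5)); // false: same version accepted
        System.out.println(patchedConflict(5, 4)); // true: older version rejected
    }
}
```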


With that change, you would open the door to hell ;) Version conflicts must
arise if there is more than one request with the same version.

Jörg


Hmm, sorry, I was not precise enough. I don't mean resolving version
conflicts this way on every insert, but rather creating some kind of "force
mode" that would be used only in the case described in my first post, and
only in the "proper way", e.g. we ensure that at any given point in time only
one request with such an update reaches ES. Would this still cause problems?
For example, could data updated in this "force mode" fail to propagate to
other nodes, resulting in data inconsistency?


I think you misunderstand the versioning concept in MVCC.

In ES, it only looks as if docs simply carry a version that a smart
client might be able to "supersede" at will. The actual concept is that a
version is attached, within a read/write cycle, to the client that is
dealing with the doc. It is the client that must be absolutely sure that the
write is "valid": new data is written against the version that was read
before. If the version is still the same, the document's read/write cycle
has not been corrupted.

So if you introduce a "force mode", you break this causality. Other clients
read a version, then a "force mode" client touches the doc, but the version
stays the same. All other clients that had read the old version are messed
up: they can no longer trust the versioning, or the data they read, or both.

So, if you don't want MVCC, you should just ignore versioning in your
design...

Jörg
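
The causality argument above can be shown with a toy optimistic-concurrency model. This is a sketch under stated assumptions, not Elasticsearch code: `MvccDemo`, `write`, and `forceWrite` are hypothetical names, and the model uses a simple "write succeeds only if the version you read is still current" rule to stand in for the read/write cycle described in the post.

```java
// Toy optimistic-concurrency model; hypothetical names, not Elasticsearch code.
final class MvccDemo {
    long version = 1;
    String body = "original";

    // Normal write: succeeds only if the caller read the current version,
    // and bumps the version so that other readers can detect the change.
    boolean write(long readVersion, String newBody) {
        if (readVersion != version) {
            return false; // version conflict: the caller's read is stale
        }
        body = newBody;
        version++;
        return true;
    }

    // Hypothetical "force mode": replaces the body but keeps the version.
    void forceWrite(String newBody) {
        body = newBody;
    }

    public static void main(String[] args) {
        MvccDemo doc = new MvccDemo();
        long seenByA = doc.version;    // client A reads the doc at version 1
        doc.forceWrite("forced fix");  // the version is still 1 afterwards
        // A's write is accepted although A never saw "forced fix".
        System.out.println(doc.write(seenByA, "A's edit based on original"));
        System.out.println(doc.body);
    }
}
```

Because the forced write leaves the version untouched, client A has no way to notice that the data it read is stale, and its write silently discards the forced fix. With a normal write in place of `forceWrite`, the version would have moved on and A's write would have been rejected.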
