My apologies, Stefan. I found your earlier question
(https://groups.google.com/d/topic/elasticsearch/IU0b09LYs98/discussion),
and Adrien's answer along with your clarification helps.
Again, I am only a (very happy) user of ES and not a developer of ES. But
my experience has been that once the delete completes and returns to my
caller, that record is as if it's gone (whether hidden or expunged, it
doesn't seem to matter).
So I would hazard a guess and say that out of order actions do matter. In
fact, I have one application where I get a stream of updates as a mix of
either add/update or delete. The order is important; the customer has
specifically ordered the transactions so that the most recent update to a
document is the one that should be reflected after all updates have been
applied. Therefore, I never split an update stream to multiple threads;
instead, I always issue the updates sequentially so that they are always
applied in the correct order. Happily, ES performance and back-end
threading is such that they breeze through. (And all this is against an
index with about 97 million documents with daily updates totaling perhaps
200K to several million per day).
I haven't fully explored external versioning, but what little I have
explored leads me to think that as long as your application can assign
version numbers that correspond to the intended order, out-of-order
transactions will still yield the correct final result.
For example, if your v1, v2, and v3 are indeed increasing version numbers,
then the following two series will result in the same state: the document
is deleted.
index v1, delete v3, index v2
index v1, index v2, delete v3
This guess is based on my experience that even a deleted document's version
is somehow seen by ES. If I use internal version numbers, then a
successful sequence of {create, index, delete, index, delete, index}
results in a document version of 6. But I'm not fully sure how external
versioning works with deleted documents. If it works as it does with
internal versioning, then you should be OK.
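To make that guess concrete, here is a toy model in Python of the rule I am assuming: a write is accepted only if its version is higher than the one currently stored, and a delete leaves behind a versioned tombstone. This is only a sketch of my mental model, not Elasticsearch code; ToyVersionedStore and replay are names I made up for illustration.

```python
class ToyVersionedStore:
    """Toy model of external versioning. Not Elasticsearch code."""

    def __init__(self):
        # doc_id -> (version, source); source is None for a delete tombstone
        self._docs = {}

    def _accept(self, doc_id, version):
        # Assumed rule: accept only a strictly higher version than the stored one.
        current = self._docs.get(doc_id)
        return current is None or version > current[0]

    def index(self, doc_id, version, source):
        if not self._accept(doc_id, version):
            return False  # stale write, rejected
        self._docs[doc_id] = (version, source)
        return True

    def delete(self, doc_id, version):
        if not self._accept(doc_id, version):
            return False  # stale delete, rejected
        self._docs[doc_id] = (version, None)  # keep the version as a tombstone
        return True

    def get(self, doc_id):
        entry = self._docs.get(doc_id)
        return None if entry is None else entry[1]


def replay(ops):
    """Apply (operation, version) pairs to one document and return its final source."""
    store = ToyVersionedStore()
    for op, version in ops:
        if op == "index":
            store.index("doc", version, {"v": version})
        else:
            store.delete("doc", version)
    return store.get("doc")


out_of_order = replay([("index", 1), ("delete", 3), ("index", 2)])
in_order = replay([("index", 1), ("index", 2), ("delete", 3)])
print(out_of_order, in_order)
```

In this model both series leave the document deleted, because the late "index v2" is rejected against the v3 tombstone. Whether ES keeps that tombstone around long enough in practice is exactly the part I am unsure about.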
Brian
On Monday, July 29, 2013 12:26:34 PM UTC-4, Stefan Fußenegger wrote:
Brian,
Thanks for your effort, but I'm not talking about expired documents (as in
TTL) but actually and manually deleted ones (as described in my previous
question https://groups.google.com/d/msg/elasticsearch/IU0b09LYs98/Z8ysJe9wdmEJ).
Deleted documents are still considered for optimistic locking purposes. By
default, they are fully expunged after 60s (index.gc_deletes). But as
this behaviour/setting isn't well documented, I'm looking for answers here.
This is useful if operations appear out of order (e.g. index v1, delete
v3, index v2). As I said, it does work. All I'm trying to figure out is how
reliable it is or what other settings I might temporarily change (e.g.
disabling optimizations).
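To illustrate why the tombstone lifetime matters for reliability, here is a toy model in Python. It assumes that a delete keeps its version for a fixed number of seconds (60, as with the default mentioned above) and that after this window the document is treated as if it never existed. ToyIndex and late_index_accepted are hypothetical names for illustration only; this is a sketch of the behaviour as I understand it, not how Elasticsearch is implemented.

```python
GC_DELETES_SECONDS = 60  # assumed tombstone lifetime, mirroring the 60s default


class ToyIndex:
    """Toy model of versioned deletes with expiring tombstones."""

    def __init__(self):
        self.docs = {}        # doc_id -> version of the live document
        self.tombstones = {}  # doc_id -> (version, time of delete)

    def _current_version(self, doc_id, now):
        if doc_id in self.docs:
            return self.docs[doc_id]
        tomb = self.tombstones.get(doc_id)
        if tomb is not None:
            version, deleted_at = tomb
            if now - deleted_at < GC_DELETES_SECONDS:
                return version           # tombstone still guards the version
            del self.tombstones[doc_id]  # tombstone expunged
        return None

    def index(self, doc_id, version, now):
        current = self._current_version(doc_id, now)
        if current is not None and version <= current:
            return False  # rejected as stale
        self.tombstones.pop(doc_id, None)
        self.docs[doc_id] = version
        return True

    def delete(self, doc_id, version, now):
        current = self._current_version(doc_id, now)
        if current is not None and version <= current:
            return False  # rejected as stale
        self.docs.pop(doc_id, None)
        self.tombstones[doc_id] = (version, now)
        return True


def late_index_accepted(delay_seconds):
    """Run index v1, delete v3, then index v2 after the given delay."""
    idx = ToyIndex()
    idx.index("doc", 1, now=0)
    idx.delete("doc", 3, now=1)
    return idx.index("doc", 2, now=1 + delay_seconds)


print(late_index_accepted(5))    # v2 arrives while the tombstone is alive
print(late_index_accepted(120))  # v2 arrives after the tombstone expired
```

In this model the late "index v2" is rejected while the tombstone is alive but accepted once it has expired, which would resurrect a document that should stay deleted. That is the window I am worried about.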
My use case is a tool that synchronizes indices to facilitate upgrades
with zero downtime: https://github.com/molindo/molindo-elasticsync.
Thanks, Stefan