It's very simple: Cassandra delegates that overhead to read repair, so if your node is down long enough, you end up doing the same work, piggybacked on the user's API invocation. And I am putting aside the fact that you can lose operations, since if a node ("shard") is not available to route to, it writes locally and retries later... (unless something changed, that's what I know). Hadoop does something similar to what elasticsearch does.
There is no read repair in elasticsearch (it can be implemented, of course, like anything else in software, but it would kill your performance; search engines provide an order of magnitude more query capabilities than cassandra, thus the difficulty).
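To make the read-repair idea concrete, here is a minimal sketch (my own illustration, not Cassandra's actual code): the coordinator compares the timestamped versions returned by the replicas, answers with the newest one, and writes it back to any stale replica as part of the read path.

```python
def read_repair(replicas, key):
    # Each replica is modeled as a dict mapping key -> (timestamp, value);
    # all names here are hypothetical, for illustration only.
    responses = [(r, r.get(key, (0, None))) for r in replicas]
    ts, value = max(resp for _, resp in responses)  # newest version wins
    for replica, (r_ts, _) in responses:
        if r_ts < ts:  # stale copy: repair it while serving the read
            replica[key] = (ts, value)
    return value
```

This is the overhead Shay refers to: every divergent read carries the extra cost of writing the winning version back to lagging replicas.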
Since read repair is not possible, and in a distributed system you can't tell which node went down last, or what the state of the cluster was when it was brought down (for example: you bring down machine A, and after 10 minutes bring down machine B, while the system is actively indexing data; then you start machine A, and after 10 minutes start machine B, and you can basically lose most of the data that was indexed on machine B), elasticsearch relies on external storage to provide that master record when it boots.
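A toy simulation of that restart ordering (the names and the naive resync rule are my assumptions, just to expose the failure mode): without an external master record, the first node to restart becomes the authority, so writes that only reached the other node are silently dropped.

```python
def simulate_restart_ordering():
    # Both replicas start in sync.
    a, b = {"doc1": 1}, {"doc1": 1}
    # Machine A goes down; indexing continues and only B sees doc2.
    b["doc2"] = 2
    # Machine B goes down. Machine A restarts first and, with no master
    # record to consult, is treated as the authoritative copy.
    authority = dict(a)
    # Machine B restarts and naively resyncs from the "authority",
    # discarding the writes only it had.
    b = dict(authority)
    return b
```

Running it, `doc2` is gone, which is exactly the data-loss scenario described above.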
Having said that, recovery from the gateway can be (and should be) heavily optimized, which is on my "roadmap". There is no reason to fetch from the gateway files that already existed on the node when it was brought down: because of the write-once nature of files in search engines, they can be reused. It's a bit tricky, but it can be done.
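The optimization could look something like this sketch (hypothetical names, not elasticsearch's implementation): because segment files are write-once, a file with the same name and checksum on both sides is identical, so only missing or changed files need to cross the wire.

```python
def files_to_fetch(local, gateway):
    # local and gateway map segment file name -> checksum.
    # Write-once files never change in place, so a name + checksum match
    # means the local copy can be reused as-is during recovery.
    return [name for name, checksum in gateway.items()
            if local.get(name) != checksum]
```

On restart, the node would fetch only that list from the gateway instead of the whole index.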
-shay.banon
On Thu, Jun 3, 2010 at 12:48 PM, Ori Lahav olahav@gmail.com wrote:
Shay,
Cassandra (and I think Hadoop too) recovers from local storage for sure, and then gets only the missing data from other nodes.
Solr replicas (as well as any other replication-based system) also recover from local disk and then roll forward changes until they are in sync with the master.
I don't see why an ES shard can't recover from local disk with its own log position and then roll the transaction log for changes from the master.
I understand that ES was planned to run on cloud setups where you switch servers more frequently, but even in the cloud, if you just want to upgrade code or take the service down for a minute, there is no need to move gigs of data over the wire; it just doesn't make sense.
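The suggestion could be sketched like this (all names are hypothetical, not ES's API): load the documents from local disk, then replay only the master's transaction log entries past the locally persisted position.

```python
def recover_from_local(local_docs, local_position, master_log):
    # master_log is a list of (position, op, doc_id, value) entries.
    # Only operations newer than the local log position are replayed,
    # so a short outage moves a few log entries, not gigs of index files.
    for pos, op, doc_id, value in master_log:
        if pos > local_position:
            if op == "index":
                local_docs[doc_id] = value
            elif op == "delete":
                local_docs.pop(doc_id, None)
    return local_docs
```

The shorter the downtime, the fewer entries there are past the saved position, which is the whole point of the proposal.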
On Tue, Jun 1, 2010 at 11:25 PM, Shay Banon shay.banon@elasticsearch.com wrote:
Also, a shard will read its data from the gateway only the first time it gets instantiated. Otherwise, it will recover its state from a replica (which still takes its toll, but is manageable).
By the way, do you have the same problem with cassandra / hadoop doing the same? Cause they do.
On Tue, Jun 1, 2010 at 11:16 PM, Shay Banon <shay.banon@elasticsearch.com> wrote:
This can't be done due to consistency issues in distributed systems.
On Tue, Jun 1, 2010 at 5:21 PM, Yatir Ben Shlomo yatirb@gmail.com wrote:
Hi,
If I am using a large index (tens of GBs) and a cloud node restarts, it takes a long time for the node to read the whole index from the gateway.
In cloud environments this is understandable, but if this node is running as part of my LAN, I would be happy if I could configure the machine to reuse the files in its work directory (if they exist) and, in the background, do whatever it takes to synchronize with the data on the gateway; until the data is synchronized, let the node use its local files.
--
http://olahav.typepad.com