How does Elasticsearch protect from data loss?

(Attila Nagy) #1


I would like to know whether (and how) ES protects me from data loss in the following case:
I have an index with the default 5 shards, spread across three machines. The index has a replica count of 1 (two copies of each shard).
In the default setting, if one of the replicas fails, writes still succeed.
So what happens when I lose one machine, which is the slave for shard 1, write to the shard (the write goes into the shard 1 master), and then I lose the shard 1 master?
A write has been acknowledged, but the shard was not fully replicated to the set replica count.
Now if the previous shard 1 slave comes up, will I lose the acknowledged write, or does ES know that the previous shard 1 master had a newer version of the index and won't let the slave come up until the old shard 1 master comes back?

Does ES have a safety mechanism which covers multiple node losses?


(Luca Cavanna) #2

Elasticsearch uses synchronous replication, which means that when the index API returns, the document has been written successfully to the primary and all of its assigned replicas. The response of the index API is then indicative of whether the write as a whole (to all shard copies involved) was successful or not.
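As a sketch, the `_shards` section of an index API response reports how many copies acknowledged the write (index, type, and document names here are illustrative):

```json
PUT /myindex/mytype/1
{ "field": "value" }

{
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  }
}
```

Here `total` is the number of shard copies the write targeted (primary plus replicas), and `successful` is the number of copies that acknowledged it.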

Does this clarify things for you?


(Attila Nagy) #3


Well, I'm not sure it covers the case I've written about.
Let me elaborate on it and reduce it to a minimal example:
I have two machines, A and B. I have an index with one shard (S0) and replica count=1.
S0 master is on A, slave is on B.
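For concreteness, such an index could be created like this (the index name is illustrative):

```json
PUT /myindex
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}
```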
What will happen if I:

  1. halt B
  2. write a document (goes to A, succeeds by default)
  3. halt A
  4. start B
  5. start A

What do I see at step 4 and at step 5? Will I still have my document?

BTW, the same thing can happen with a higher replica count (where a majority can be reached) and a higher number of machines; of course, more failures must happen there.

Thanks for the quick response.

(Jörg Prante) #4

This is where master eligibility comes into play.

With only two master-eligible nodes you are lost; in particular, the order in which you restart the nodes determines how they form the cluster.

The lesson is to set up more than two master-eligible nodes, set discovery.zen.minimum_master_nodes to >=2 before recovery starts, and select an odd number of master-eligible nodes, so a split brain cannot happen.
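A minimal sketch of such a configuration, assuming three master-eligible nodes (the exact values depend on your cluster size):

```yaml
# elasticsearch.yml on each master-eligible node
discovery.zen.minimum_master_nodes: 2   # quorum of 3 master-eligible nodes
gateway.recover_after_nodes: 2          # delay recovery until enough nodes have joined
```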

(Attila Nagy) #5

I made the mistake of talking about primary and replica shards as master and slave, and of not describing the whole situation, because I didn't want to complicate it with the master node coming into the picture.

The term master in ES nomenclature is used for master nodes.
So what I would like to describe is: zen discovery has the needed number of master-eligible machines (at least 3, or 5), they are fine, and there are dedicated data nodes with a replica count of 1, so the copies of each shard can span only two machines.
Setting discovery.zen.minimum_master_nodes to >=2 (if I understand you right) yields the same result in the above situation: the master doesn't know which data node has the more up-to-date content (but please correct me if I'm wrong).

Does setting the replica count to >=2 prevent the possibility of data loss?
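For illustration, the replica count can be changed on a live index (index name is illustrative):

```json
PUT /myindex/_settings
{
  "index": { "number_of_replicas": 2 }
}
```

With three copies, losing a single node still leaves two up-to-date copies of each shard.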

(Boaz Leskes) #6

Hi Attila,

Running with two copies (1 primary, 1 replica) means you can only take one fault before things go wrong. In your case the replica went down. If the primary also goes down before a new replica can be restored, then you indeed have a problem.

If the replica then comes back before the primary is re-discovered, Elasticsearch will currently choose the replica as the new primary, on the assumption that this is the only copy that survived and that we are better off using it than not having anything at all. Arguably this decision should not be made automatically (as it is right now): you, the user, may know that the primary is not lost and is coming back in a couple of minutes, and you might want to wait. We are working on making it an explicit API command (though I can't find the ticket now).
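As a sketch, such an explicit command could look something like a cluster reroute request (names here are illustrative, not a final API; the `accept_data_loss` flag acknowledges that writes which only reached the lost primary may be gone):

```json
POST /_cluster/reroute
{
  "commands": [
    {
      "allocate_stale_primary": {
        "index": "myindex",
        "shard": 0,
        "node": "nodeA",
        "accept_data_loss": true
      }
    }
  ]
}
```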


(system) #7