Will ES discard redundant replicas?


(Zaar Hai) #1

Suppose I have a cluster with nodes N1 and N2 hosting index I1 with one primary shard S1 and one replica R1.

The scenario is as follows:

  1. Shut down node N2.
  2. Launch a new, fresh node N3.
  3. Wait until ES creates R3 on N3.
  4. Boot N2.

Now we have a scenario where I1 has two replicas instead of one.

  1. Will ES be smart enough to discard the redundant replica?
  2. What happens if N2 comes back while ES is still copying S1 to R3? Will it abort the copy and use R2?

I'm talking about ES 2.x. Answers regarding 5.x are welcome as well.

Thank you,
Zaar


(Nik Everett) #2

Sure.

It's complicated. Newer versions of Elasticsearch can cancel a replication in progress, and I believe it's possible for 2.x to abort as well, but I could be wrong. I don't know that code super well.

When N2 comes back Elasticsearch has to make sure that R2 is the same as R1. It can do this in two ways:

  1. checking that the files are the same
  2. synced flush

If it doesn't see R2 as the same as R1 then it'll take into account how much of it is the same when determining which shard is "further along".
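
To make the synced flush route more concrete, here is a rough Python sketch, not a definitive recipe. It builds (without sending) a request to the synced flush endpoint, `POST /<index>/_flush/synced`, which is available in the 2.x and 5.x releases discussed here, and it counts started shard copies from the text output of `GET /_cat/shards?h=index,shard,prirep,state,node`. The address `http://localhost:9200`, the index name `i1`, and the helper names `synced_flush_request` and `count_copies` are assumptions made up for illustration.

```python
import urllib.request

# Assumption: the cluster is reachable at this address; adjust as needed.
ES_URL = "http://localhost:9200"


def synced_flush_request(index, base_url=ES_URL):
    """Build (but do not send) a synced flush request for one index."""
    url = "{}/{}/_flush/synced".format(base_url, index)
    return urllib.request.Request(url, method="POST")


def count_copies(cat_shards_text):
    """Count STARTED primaries ("p") and replicas ("r") per (index, shard).

    Expects the plain-text body of
    GET /_cat/shards?h=index,shard,prirep,state,node
    """
    counts = {}
    for line in cat_shards_text.splitlines():
        fields = line.split()
        # Unassigned shards have no node column, so 4 fields is the minimum.
        if len(fields) < 4 or not fields[1].isdigit():
            continue
        index, shard, prirep, state = fields[:4]
        if state != "STARTED":
            continue  # skip copies that are still initializing or relocating
        entry = counts.setdefault((index, int(shard)), {})
        entry[prirep] = entry.get(prirep, 0) + 1
    return counts


if __name__ == "__main__":
    req = synced_flush_request("i1")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Running the synced flush before shutting N2 down, and then calling `count_copies` on the `_cat/shards` output once N2 is back, is one way to watch whether the cluster settles on a single replica again.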


(Zaar Hai) #3

Great. Thank you!

