Serious Availability Challenges w/ Replica Recovery Process in Cloud Environments

(Nariman Haghighi) #1

I've read as much as I can get my hands on with respect to cluster restarts
and optimizing replica recovery [1] on 1.3.2 and I still think there's a
big problem here.


  • 2 node cluster running on Azure (2 large nodes: 7GB RAM, 4 cores, 400
    Mbps network)
  • 600GB disk on each side, current cluster is using about 25GB of that on
    each node
  • Cluster must be resilient to index requests at all times! This is a core
    requirement that we can't get around.

What works pretty reliably in this scenario: after restarting one node, wait
for a full sync (GREEN status) before restarting the other. Anything else
will cause data loss.
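For completeness, the restart dance we follow looks roughly like this — a sketch for ES 1.x using the allocation-disable trick from the threads in [1], with host/port as placeholders, not a tested script:

```shell
# Disable shard reallocation so the cluster doesn't start rebuilding the
# restarting node's shards elsewhere while it is down.
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": { "cluster.routing.allocation.enable": "none" }
}'

# ... restart the node and wait for it to rejoin the cluster ...

# Re-enable allocation, then block until GREEN before touching the other node.
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": { "cluster.routing.allocation.enable": "all" }
}'
curl 'http://localhost:9200/_cluster/health?wait_for_status=green&timeout=15m'
```

Even with allocation disabled, the replica still has to catch up on restart, which is exactly where the recovery cost below bites.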

The trouble with this approach is that we are dealing with a network capacity
of 400 Mbps, and while 25 GB can be transferred in a reasonable time... 600 GB
definitely can't.
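To put numbers on that, here is the back-of-the-envelope math at ideal line rate (no protocol overhead, so real transfers will only be slower):

```python
# Ideal full-sync time over a link of the given speed.
def transfer_minutes(gigabytes: float, link_mbps: float) -> float:
    gigabits = gigabytes * 8
    seconds = gigabits / (link_mbps / 1000)  # Mbps -> Gbps
    return seconds / 60

print(round(transfer_minutes(25, 400), 1))   # ~8.3 minutes: workable
print(round(transfer_minutes(600, 400), 1))  # ~200 minutes (3.3 hours)
```

200 minutes is an order of magnitude past any sane recovery window, and that's the best case.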

On Azure, you have a max of 15 minutes OnStart to make sure the instance
is healthy again before the fabric controller recycles it for you. And I
think that limit is reasonable.

Sure, we could go up one size to Extra Large nodes that offer an 800 Mbps
network, but that still isn't going to get us to the point where we can
maintain 100% availability of a 600 GB cluster on 2 nodes.

There is a bit of light at the end of the tunnel:

We plan on assigning a sequence number to operations that occur on primary
shards. This is a really interesting feature that will lay the groundwork
for many future features in Elasticsearch. The most obvious one is speeding
up the replica recovery process when a node is restarted. Currently we have
to copy every segment that differs, which, over time, means every segment!
Sequence numbers will allow us to copy over only the data that has really
changed.
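As I understand the idea, it is the classic checkpoint-and-replay pattern. Here is a toy sketch of the concept — my own illustration, not Elasticsearch code, with all class and method names made up: if the replica remembers the highest sequence number it has applied, recovery becomes "send me everything after N" instead of a full segment copy.

```python
# Toy illustration of sequence-number-based recovery.
class Primary:
    def __init__(self):
        self.ops = []  # ordered list of (seq_no, doc)

    def index(self, doc):
        self.ops.append((len(self.ops), doc))

    def ops_since(self, seq_no):
        # Only the operations the replica hasn't seen yet.
        return self.ops[seq_no + 1:]

class Replica:
    def __init__(self):
        self.docs = []
        self.local_checkpoint = -1  # highest seq_no applied locally

    def recover_from(self, primary):
        for seq_no, doc in primary.ops_since(self.local_checkpoint):
            self.docs.append(doc)
            self.local_checkpoint = seq_no
```

After a restart, the replica replays only the delta since its checkpoint, so recovery cost scales with the write volume during the outage, not with the 600 GB on disk.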

It couldn't come soon enough for us. The current replica recovery process
is really untenable in cloud environments that carry heavy INDEX workloads.

Any thoughts on when this can be expected or if there are viable
workarounds in the meantime would be greatly appreciated.

[1] - Research (Google Groups search links):
    !searchin/elasticsearch/cluster$20restart$20disable$20allocation/elasticsearch/csziKQPBauU/9PKWbkhJ50IJ
    !searchin/elasticsearch/cluster$20restart/elasticsearch/lN6copl0Bzk/RK5ESX8nu-8J
    !searchin/elasticsearch/cluster$20restart/elasticsearch/plTGgtE_YCU/fCyd2elxkAsJ


(system) #2