My understanding is that recovery from Split Brain situations is
troublesome, and we are encouraged to ensure that a cluster is only active
if there is a quorum of candidate masters. We do that by setting
discovery.zen.minimum_master_nodes to (cm / 2) + 1 (integer division), that
is, a true majority of the candidate masters.
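As a concrete sketch (the cluster size here is an assumption for
illustration): with cm = 10 master-eligible nodes, the quorum is
(10 / 2) + 1 = 6, set in elasticsearch.yml:

```yaml
# elasticsearch.yml - illustrative values, assuming 10 master-eligible nodes
# quorum = (10 / 2) + 1 = 6 (integer division)
discovery.zen.minimum_master_nodes: 6
```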
I also see that many folks want to enable resilience in the event of
disaster by running a cluster across two data centres. Lose one data centre
and we just keep running in the other - we still have half our servers, so
with correct over-provisioning we can cope with the workload. No need to
rebuild the cluster and restore from backup. We have what might be called
Live/Live Disaster Recovery (DR).
I see two issues with this approach to DR. If we lose half our candidate
master nodes, by definition we cannot achieve a quorum - we won't have a
true majority.
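The arithmetic behind this, as a sketch (assuming 10 candidate masters split
evenly across the two data centres - the numbers are illustrative):

```shell
# 10 candidate masters, 5 in each data centre (illustrative numbers)
cm=10
quorum=$(( cm / 2 + 1 ))   # minimum_master_nodes = 6
surviving=$(( cm / 2 ))    # one data centre lost: 5 masters remain
# 5 < 6: the surviving half can never form a quorum on its own
if [ "$surviving" -lt "$quorum" ]; then
  echo "no quorum possible"
fi
```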
A slightly more subtle problem is that we probably also set
gateway.recover_after_nodes to a high value (say 8 of 10), so that in
normal running with a few missing nodes we don't inadvertently get into
shard shuffling. Once again, with the loss of half our estate we will not
have enough nodes to satisfy that setting.
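For instance (a sketch for the hypothetical 10-node cluster above; the
expected_nodes line is an addition I'd typically pair with it):

```yaml
# elasticsearch.yml - illustrative recovery settings for a 10-node cluster
gateway.recover_after_nodes: 8   # don't begin recovery until 8 nodes have joined
gateway.expected_nodes: 10       # recover immediately once all 10 are present
```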
My conclusion is that while Live/Live operation across two data centres is
possible, the transition to a reduced state requires reconfiguration; the
system will not "just keep running" unless we open ourselves to Split Brain
and replica shuffling.
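For what it's worth, minimum_master_nodes is dynamically updatable via the
cluster settings API, so the reconfiguration need not mean a rolling restart
in every case (a sketch; the host and the new value are assumptions for a
5-node surviving half):

```shell
# Sketch: lower the quorum on the surviving half (hostname and value are
# assumptions). Note this API call needs an elected master to succeed;
# if the survivors cannot elect one, elasticsearch.yml must be edited and
# the nodes restarted instead - exactly the reconfiguration burden above.
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": { "discovery.zen.minimum_master_nodes": 3 }
}'
```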
It's not recommended to run cross-DC clusters, for these reasons (and more).
You'd be better off having two clusters and then syncing data between them.
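One way to sync between two clusters is the snapshot/restore API (a sketch;
the repository name, filesystem path, and host names are assumptions, and
the shared path must be visible to all nodes in both clusters):

```shell
# Register a shared filesystem snapshot repository on the primary cluster
# (repository name, path, and hosts are illustrative assumptions)
curl -XPUT 'http://primary:9200/_snapshot/dr_backup' -d '{
  "type": "fs",
  "settings": { "location": "/mnt/backups/dr_backup" }
}'
# Take a snapshot, then restore it on the secondary cluster
curl -XPUT 'http://primary:9200/_snapshot/dr_backup/snap_1?wait_for_completion=true'
curl -XPOST 'http://secondary:9200/_snapshot/dr_backup/snap_1/_restore'
```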
On 11 December 2014 at 17:44, David Artus djna01@gmail.com wrote:
If I remember correctly, version 1.4 can turn nodes that can't connect to
the cluster into read-only mode.
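This is presumably the discovery.zen.no_master_block setting, from memory of
the 1.4 docs ("write" rejects writes but still serves reads when no master
is reachable; the default is "all", which blocks both):

```yaml
# elasticsearch.yml - reject writes but keep serving reads when the node
# has no master (setting introduced around 1.4; default is "all")
discovery.zen.no_master_block: write
```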
On 11 December 2014 at 20:00, Elvar Böðvarsson elvarb@gmail.com wrote: