Cross-DC clusters - specific dangers

I am aware that cross data-center clusters are not recommended, since they
violate one of the core assumptions of ES, namely that all nodes are equal.
But what specifically (apart from the obvious problems associated with
network failure) can this lead to: is it just high or "irregular" latency
and the difficulty of debugging issues when node response times are unequal,
or can more critical issues such as split brain also arise from this?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/a8fe93fb-dfd1-414a-86d7-2ebad66107a4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Split-brain risk is not related to latency; it can happen on any dynamic
network, i.e. one where nodes can drop out and rejoin.
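
For illustration: the usual guard against split brain in ES 1.x is the
discovery.zen.minimum_master_nodes setting, sized to a strict majority of
the master-eligible nodes. Below is a minimal sketch of that arithmetic. The
helper names (quorum, can_elect_master) and the 6-node, two-DC layout are
hypothetical, chosen only for the example.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def can_elect_master(visible_nodes: int, master_eligible_nodes: int) -> bool:
    """True if one side of a partition still sees a majority."""
    return visible_nodes >= quorum(master_eligible_nodes)


if __name__ == "__main__":
    total = 6                 # hypothetical: 3 master-eligible nodes per DC
    partition_sizes = [3, 3]  # the inter-DC link drops
    print("minimum_master_nodes should be", quorum(total))
    for size in partition_sizes:
        print(size, "nodes visible, can elect master:",
              can_elect_master(size, total))
```

With an even split across two DCs, neither side reaches the majority when
the link drops, so the setting prevents two masters but at the cost of
having none until connectivity returns.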

The main issue is latency, yes. This is a killer. If latency is too high,
real-time systems can be seen as unusable from a user perspective.

The second issue is network bandwidth. LAN traffic is typically an order of
magnitude faster than WAN traffic.
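
As a rough illustration of why this matters when shard data has to be copied
between nodes, the sketch below estimates transfer time from payload size
and link speed. The function name, the shard size (10 GB) and the link
speeds (1 Gbit/s LAN, 100 Mbit/s WAN) are assumptions made up for the
example, not measurements.

```python
def transfer_seconds(size_bytes: int, link_bits_per_second: float) -> float:
    """Idealized transfer time, ignoring latency and protocol overhead."""
    if link_bits_per_second <= 0:
        raise ValueError("link speed must be positive")
    return size_bytes * 8 / link_bits_per_second


# Assumed figures, for illustration only.
SHARD_BYTES = 10 * 10**9  # 10 GB shard
LAN_BPS = 1e9             # 1 Gbit/s inside a DC
WAN_BPS = 1e8             # 100 Mbit/s between DCs

lan = transfer_seconds(SHARD_BYTES, LAN_BPS)
wan = transfer_seconds(SHARD_BYTES, WAN_BPS)

if __name__ == "__main__":
    print(f"LAN: {lan:.0f} s, WAN: {wan:.0f} s")
```

Under these assumptions, a shard copy that takes a bit over a minute inside
one DC takes more than ten minutes across the WAN link.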

Another issue is timing. ES does not have vector clocks yet. That means a
node does not distinguish between a local time and a global time; instead,
the cluster assumes all nodes share the same clock. As a consequence, the ES
code for indexing and search is relatively easy to maintain (it assumes the
causality rule "I write first, then you can read next" is never broken). In
a distributed system, this rule no longer always holds when reads and
writes are interleaved and forwarded to other nodes, and missing
coordination can lead to all kinds of strange effects (hangs, missing data,
wrong data, conflicts, just to name a few). These effects are expected to
be more frequent on a cluster that spans DCs. I expect ES 2.0 to be a step
forward: it seems it will introduce sequence numbers for operations, and
probably a distributed clock.
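
To make the vector-clock idea concrete, here is a minimal sketch. It is not
ES code: every name in it (increment, merge, happened_before, concurrent,
the node IDs "dc1" and "dc2") is hypothetical. Each node keeps one counter
per node, and two events are ordered only if one clock is less than or equal
to the other in every component.

```python
from typing import Dict

Clock = Dict[str, int]


def increment(clock: Clock, node: str) -> Clock:
    """Return a copy of the clock with this node's counter advanced."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def merge(a: Clock, b: Clock) -> Clock:
    """Component-wise maximum, applied when a node receives a message."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


def happened_before(a: Clock, b: Clock) -> bool:
    """True if a <= b in every component and a < b in at least one."""
    nodes = set(a) | set(b)
    no_greater = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    some_less = any(a.get(n, 0) < b.get(n, 0) for n in nodes)
    return no_greater and some_less


def concurrent(a: Clock, b: Clock) -> bool:
    """True if the clocks differ and neither event precedes the other."""
    nodes = set(a) | set(b)
    differs = any(a.get(n, 0) != b.get(n, 0) for n in nodes)
    return differs and not happened_before(a, b) and not happened_before(b, a)


if __name__ == "__main__":
    w1 = increment({}, "dc1")  # write accepted in DC 1
    w2 = increment({}, "dc2")  # independent write accepted in DC 2
    print("writes conflict:", concurrent(w1, w2))
    r = increment(merge(w1, w2), "dc1")  # dc1 sees both writes, then acts
    print("w1 before r:", happened_before(w1, r))
```

A shared wall clock cannot tell the two writes apart in this way; the
per-node counters make the conflict detectable instead of silently picking
a winner.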

With snapshot/restore, data can already be transported between two ES
clusters in two DCs. By paying the price of lagging behind a leading
cluster, another cluster can be set up as a follower cluster quite easily,
keeping latency low and working around the timing challenge.
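
A minimal sketch of that leader/follower flow, using the snapshot and
restore REST endpoints. It only builds the two HTTP requests and does not
send them. The host names, the repository name "dc_backup" and the snapshot
name are hypothetical, and the sketch assumes both clusters already have a
snapshot repository registered under that name, backed by storage that both
DCs can reach.

```python
import urllib.request


def snapshot_request(leader_url: str, repo: str,
                     snapshot: str) -> urllib.request.Request:
    """PUT /_snapshot/<repo>/<snapshot>: take a snapshot on the leader."""
    url = f"{leader_url}/_snapshot/{repo}/{snapshot}?wait_for_completion=true"
    return urllib.request.Request(url, method="PUT")


def restore_request(follower_url: str, repo: str,
                    snapshot: str) -> urllib.request.Request:
    """POST /_snapshot/<repo>/<snapshot>/_restore: restore on the follower."""
    url = f"{follower_url}/_snapshot/{repo}/{snapshot}/_restore"
    return urllib.request.Request(url, method="POST")


if __name__ == "__main__":
    # Hypothetical endpoints and names, for illustration only.
    take = snapshot_request("http://es-dc1.example:9200", "dc_backup", "snap_1")
    load = restore_request("http://es-dc2.example:9200", "dc_backup", "snap_1")
    for req in (take, load):
        print(req.get_method(), req.full_url)
    # To actually run a step: urllib.request.urlopen(take)
```

Running the two steps on a schedule gives a follower that trails the leader
by roughly one snapshot interval. Note that a restore can only write into an
index that is closed or does not yet exist on the follower.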

Jörg

On Tue, Apr 14, 2015 at 8:53 AM, AndrewK kenworthyas@gmail.com wrote:

I am aware that cross data-center clusters are not recommended, since they
violate one of the core assumptions of ES, namely that all nodes are equal.
But what specifically (apart from the obvious problems associated with
network failure) can this lead to: is it just high or "irregular" latency
and the difficulty of debugging issues when node response times are unequal,
or can more critical issues such as split brain also arise from this?

