Split brain and ES node in multiple clusters

Hi,

I have a 3-node ES cluster and I've set discovery.zen.minimum_master_nodes =
2 and node.max_local_storage_nodes = 1. The nodes are deployed on NFS so
that if a machine fails, an ES process can be started on the same data path
from a different machine. Each machine has access to the data directories
of all ES nodes.

The initial deployment was:
ES1 - Node1:9300 (path.data=/nfs/node1)
ES2 - Node2:9300 (path.data=/nfs/node2)
ES3 - Node3:9300 (path.data=/nfs/node3)
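For reference, each node's elasticsearch.yml looks roughly like this (the
cluster name here is a placeholder; path.data and ports differ per node):

    # elasticsearch.yml on Node2 (sketch)
    cluster.name: mycluster                  # placeholder name
    path.data: /nfs/node2                    # per-node data directory on NFS
    discovery.zen.minimum_master_nodes: 2    # quorum for a 3-node cluster
    node.max_local_storage_nodes: 1          # one node per data path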

Then I took the following steps:

  1. Killed the network on Node2, which resulted in a cluster with two nodes
    (ES1, ES3). The ES2 process was still running on Node2.

  2. Started a new ES process on Node1 with path.data=/nfs/node2. I assumed
    that, since node.max_local_storage_nodes = 1, it would not start, because
    ES2 on Node2:9300 already held a lock on that path, but it started anyway.
    The cluster now looked like:
    ES1 - Node1:9300 (path.data=/nfs/node1)
    ES2 - Node1:9301 (path.data=/nfs/node2)
    ES3 - Node3:9300 (path.data=/nfs/node3)

ES2 - Node2:9300 (path.data=/nfs/node2) was also still running, but it was
not part of the cluster.
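(If I understand the locking correctly, node.max_local_storage_nodes relies
on a lock file under the data path, roughly like this, assuming the default
layout and my placeholder cluster name:

    /nfs/node2/mycluster/nodes/0/node.lock

NFS frequently does not enforce such locks across machines, which would
explain why the second process could start.)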

  3. I brought the network on Node2 back up, which resulted in the following
    two clusters:
    Cluster1:
    ES1 - Node1:9300 (path.data=/nfs/node1)
    ES2 - Node1:9301 (path.data=/nfs/node2)
    ES3 - Node3:9300 (path.data=/nfs/node3)

Cluster2:
ES1 - Node1:9300 (path.data=/nfs/node1)
ES2 - Node2:9300 (path.data=/nfs/node2)

Now Node1:9300 is participating in both clusters, which doesn't seem right
to me.
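(One way to check which master each node follows, assuming HTTP is enabled
on the default port 9200:

    curl -s 'http://Node1:9200/_cluster/state?pretty' | grep master_node
    curl -s 'http://Node2:9200/_cluster/state?pretty' | grep master_node
    curl -s 'http://Node3:9200/_cluster/state?pretty' | grep master_node

If the outputs disagree, the nodes are following different masters.)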

Is there any way to restrict the participation of an ES node to a single
cluster? Also, can I specify a timeout somewhere after which an ES node
will die if the minimum number of master nodes is not reachable?

Anand


Why don't you just use path.data=/nfs ?

You can give your cluster a name in the node config, but I'm sure this is
not what you are looking for.

There is no proven fault tolerance against network connection failures
between nodes in ES, only against node failures.

There is no timeout in minimum_master_nodes because the idea is to wait for
a number of nodes to be connected to each other before a leader (master) is
elected. What should happen after the timeout, except more waiting?

Jörg


Hi Jörg,

Please find my answers inline.

Thanks,
Anand

On Wednesday, 28 August 2013 16:58:44 UTC+5:30, Jörg Prante wrote:

Why don't you just use path.data=/nfs ?

A. I'm not using /nfs as path.data because I want finer control over which
directory is used on a particular machine.
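For example, I can start each process with explicit overrides (a sketch;
settings can be passed as -Des.* system properties):

    bin/elasticsearch -f -Des.path.data=/nfs/node2 -Des.transport.tcp.port=9300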

You can give your cluster a name in the node config, but I'm sure this is
not what you are looking for.

A. The problem is that I now have two clusters with the same name, one of
them in green state. Worse, I have one node participating in two clusters.
Is there any way of preventing that?

There is no proven fault tolerance against network connection failures
between nodes in ES, only against node failures.

There is no timeout in minimum_master_nodes because the idea is to wait
for a number of nodes to be connected to each other before a leader
(master) is elected. What should happen after the timeout, except more
waiting?

A. The process could die after failing to find the required number of
eligible masters within a given time interval.

Jörg


Using /nfs would give you the control you need; subfolders are already
created for each node.
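With path.data=/nfs, each node takes its own subfolder under the cluster
directory, roughly like this (assuming the default layout and a placeholder
cluster name):

    /nfs/mycluster/nodes/0
    /nfs/mycluster/nodes/1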

Participating in two clusters is not actually possible for a node - there
would be strong conflicts in the internal state, because each node holds
only one cluster state. What is possible is that two masters both list the
node in their cluster states. This is called a "split brain"; the problem
is not the node, but the two masters.

If the process died instead of waiting, you would have to monitor and
restart nodes over and over again while a large cluster comes up slowly,
which is contrary to the aim of minimum_master_nodes.

Jörg
