I have a 3-node ES cluster, and I've set discovery.zen.minimum_master_nodes =
2 and node.max_local_storage_nodes = 1. These nodes are deployed on NFS so
that in case a machine fails, an ES process can be started on the same
data path from a different machine. Each machine has access to the data
directories of all ES nodes.
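For context, each node's elasticsearch.yml looks roughly like this (the cluster name and paths are illustrative):

    cluster.name: mycluster
    node.max_local_storage_nodes: 1
    discovery.zen.minimum_master_nodes: 2
    path.data: /nfs/node1    # /nfs/node2 and /nfs/node3 on the other machines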
Killed the network on Node2, which resulted in a cluster with two nodes
(ES1, ES3). The ES2 process was still running on Node2.
Started a new ES process on Node1 with path.data=/nfs/node2. I was
assuming that since node.max_local_storage_nodes = 1 it would not start, because
ES2 on Node2:9300 already held the lock on that directory, but it started
anyway. The cluster now looked like:
ES1 - Node1:9300 (path.data=/nfs/node1)
ES2 - Node1:9301 (path.data=/nfs/node2)
ES3 - Node3:9300 (path.data=/nfs/node3)
There was also ES2 - Node2:9300 (path.data=/nfs/node2) still running, but
not part of the cluster.
I then restored the network on Node2, which resulted in the following two clusters:
Cluster1:
ES1 - Node1:9300 (path.data=/nfs/node1)
ES2 - Node1:9301 (path.data=/nfs/node2)
ES3 - Node3:9300 (path.data=/nfs/node3)
Cluster2:
ES1 - Node1:9300 (path.data=/nfs/node1)
ES2 - Node2:9300 (path.data=/nfs/node2)
Now Node1:9300 is participating in both clusters, which doesn't seem
right to me.
Is there any way to restrict an ES node to participating in a single
cluster? Also, can I specify a timeout somewhere after which an ES node
will die if the minimum number of master nodes is not reachable?
You can give your cluster a name in the node config, but I'm sure this is
not what you are looking for.
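For example, in elasticsearch.yml (the name is a placeholder):

    cluster.name: mycluster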
There is no proven fault tolerance against network connection failures
between nodes in ES, only against node failures.
There is no timeout in minimum_master_nodes because the idea is to wait for
a number of nodes to be connected to each other before a leader (master) is
elected. What should happen after the timeout, except more waiting?
On Wednesday, 28 August 2013 16:58:44 UTC+5:30, Jörg Prante wrote:
Why don't you just use path.data=/nfs?
A. I'm not using /nfs as path.data because I want finer control over which
directory is used on a particular machine.
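For example, on Node1 I start the replacement process with an explicit data path on the command line, something like this (as far as I know, this ES version accepts any setting as a -Des. system property):

    bin/elasticsearch -Des.path.data=/nfs/node2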
You can give your cluster a name in the node config, but I'm sure this is
not what you are looking for.
A. The problem is that I now have two clusters with the same name, with
one cluster in a green state. Worse, I have one node participating in
two clusters. Is there any way of preventing that?
There is no proven fault tolerance against network connection failures
between nodes in ES, only against node failures.
There is no timeout in minimum_master_nodes because the idea is to wait
for a number of nodes to be connected to each other before a leader
(master) is elected. What should happen after the timeout, except more
waiting?
A. The process can die after not finding any eligible masters within a
specified timeout.
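In the meantime, an external watchdog could approximate this. A rough Python sketch (the health URL, the 120 s timeout and the pkill pattern are all my assumptions, and the exact no-master behaviour of the health endpoint varies by version):

    # watchdog.py - rough sketch, not production code: stop the local ES
    # process if no master has been reachable for NO_MASTER_TIMEOUT seconds.
    import subprocess
    import time
    import urllib.request

    HEALTH_URL = "http://localhost:9200/_cluster/health"  # assumed local HTTP port
    NO_MASTER_TIMEOUT = 120  # seconds without a master before giving up (assumed)

    deadline = None
    while True:
        try:
            # With no elected master this call typically errors out or times out.
            urllib.request.urlopen(HEALTH_URL, timeout=5).read()
            deadline = None  # master reachable, reset the clock
        except Exception:
            if deadline is None:
                deadline = time.time() + NO_MASTER_TIMEOUT
            elif time.time() > deadline:
                # Give up: stop the local node instead of waiting forever.
                subprocess.call(["pkill", "-f", "org.elasticsearch.bootstrap"])
                break
        time.sleep(10)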
Using /nfs would give you the control you need; there are already
subfolders for each node.
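With node.max_local_storage_nodes greater than 1, each process grabs the next free slot under the shared path, roughly like this (the cluster name is an example):

    /nfs/mycluster/nodes/0    <- first ES process
    /nfs/mycluster/nodes/1    <- second ES process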
Participating in two clusters is not actually possible for a node; there
would be strong conflicts in its internal state, because each node holds
exactly one cluster state. What is possible is that two masters both list
the node in their cluster states. This is called a "split brain"; the
problem is not the node, but the two masters.
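As a reminder, the usual guard here is to require a quorum of the master-eligible nodes, which you already did for three nodes:

    minimum_master_nodes = (master_eligible_nodes / 2) + 1
    3 master-eligible nodes: (3 / 2) + 1 = 1 + 1 = 2

With a quorum of 2, an isolated single node can never elect itself master.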
If the process died instead of waiting, you would have to monitor and
restart nodes over and over again while a large cluster comes up slowly,
which is contrary to the aim of minimum_master_nodes.