Sicker
(Sicker)
May 30, 2013, 4:56am
1
I found a problem: one node can see all 4 nodes, but another node cannot see this node, as shown in the head plugin screenshots below.
The document counts are not equal between them. Can I restore all the data?
https://lh4.googleusercontent.com/-sYKTD0FPLsk/UabbxdTEPiI/AAAAAAAAA60/5f1m8fTD4ok/s1600/Node2.jpg
https://lh6.googleusercontent.com/-j5FEk9IZg7U/Uabbs-ds-xI/AAAAAAAAA6s/PsgehI2kfJA/s1600/Node1.jpg
Ivan
(Ivan Brusic)
May 30, 2013, 2:45pm
2
All the nodes can probably see each other; your issue is that you have a
split brain, probably caused when one of the nodes was momentarily
unresponsive (longer than the timeout). You have two clusters with two separate master
nodes. I would try restarting Blind Justice (xxx.xxx.xxx.208). Your data
appears to be intact; however, you will have duplicate data. After the
cluster has reformed, look into your path.data directory for two node
directories (nodes/0 and nodes/1) and delete the one not being updated.
You should set discovery.zen.minimum_master_nodes to 3 (4/2+1), but that
still does not ensure split brain will not happen when a node is removed.
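For reference, this is the kind of one-line change in elasticsearch.yml being suggested; a minimal sketch, to be adjusted if the number of master-eligible nodes ever changes:

```yaml
# elasticsearch.yml -- set the same value on every master-eligible node
# of the 4-node cluster.
#
# Require a majority of the master-eligible nodes (4/2 + 1 = 3) before
# any node may be elected master, so a single partitioned node cannot
# form its own one-node cluster and keep accepting writes on the side.
discovery.zen.minimum_master_nodes: 3
```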
--
Ivan
Nicolas
(Nicolas Labrot)
May 30, 2013, 3:11pm
3
On Thursday, May 30, 2013 at 4:45:25 PM UTC+2, Ivan Brusic wrote:
You should set discovery.zen.minimum_master_nodes to 3 (4/2+1), but that
still does not ensure split brain will not happen when a node is removed.
For my own knowledge, why does it not ensure that a split brain will not happen?
Ivan
(Ivan Brusic)
May 30, 2013, 3:31pm
4
See:
[GitHub issue, opened 25 Jul 2012, closed 16 Jun 2014; label: bug]
## Summary:
Split brain can occur on the second network disconnect of a node, when minimum_master_nodes is configured correctly (n/2+1). The split brain occurs if the nodeId (UUID) of the disconnected node is such that the disconnected node picks itself as the next logical master while pinging the other nodes (NodeFaultDetection). The split brain only occurs the second time that the node is disconnected/isolated.
## Detail:
Using ZenDiscovery, nodeIds are randomly generated (a UUID): ZenDiscovery:169.
When a node is disconnected/isolated, the ElectMasterService uses a list of the nodes ordered by nodeId to determine a new potential master. It picks the first entry of the ordered list: ElectMasterService:95.
Because the nodeIds are random, it's possible for the disconnected/isolated node to be first in the ordered list, electing itself as a possible master.
The first time the network is disconnected, the minimum_master_nodes property is honored and the disconnected/isolated node goes into a "ping" mode, where it simply tries to ping for other nodes. Once the network is re-connected, the node re-joins the cluster successfully.
The second time the network is disconnected, the minimum_master_nodes intent is not honored. The disconnected/isolated node fails to realise that it's not connected to the remaining node in the 3-node cluster and elects itself as master, still thinking it's connected.
It feels like there is a failure in the transition between MasterFaultDetection and NodeFaultDetection, because it works the first time!
The fault only occurs if the nodeId is ordered such that the disconnected node picks itself as the master while isolated. If the nodeIds are ordered such that it picks one of the other 2 nodes to be potential master, then the isolated node honors the minimum_master_nodes intent every time.
Because the nodeIds are randomly generated (UUIDs), the probability of this occurring drops as the number of nodes in the cluster goes up. For our 3-node cluster it's ~50% (with one node detected as gone, it's up to the ordering of the remaining two nodeIds).
Note: while we were trying to track this down, we found that the cluster.service TRACE level logging (which outputs the cluster state) does not list the nodes in election order, i.e. the first node in that printed list is not necessarily going to be elected as master by the isolated node.
## Detailed steps to reproduce:
Because the ordering of the nodeIds is random (UUID), we were having trouble getting a consistently reproducible test case. To fix the ordering, we made a patch to ZenDiscovery to allow us to optionally configure a nodeId. This allowed us to set the nodeId of the disconnected/isolated node to guarantee its ordering, allowing us to consistently reproduce.
We've tested this scenario on the 0.19.4, 0.19.7, and 0.19.8 distributions and saw the error when the nodeIds were ordered just right.
We also tested this scenario on the current git master with the supplied patch.
In this scenario, node3 will be the node we disconnect/isolate. So we start the nodes up in numerical order to ensure node3 doesn't _start_ as master.
1. Configure nodes with attached configs (one is provided for each node)
2. Start up nodes 1 and 2. After they are attached and one is master, start node 3
3. Create a blank index with default shard/replica(5/1) settings
4. Pull network cable from node 3
5. Node 3 detects master has gone (MasterFaultDetection)
6. Node 3 elects itself as master (Because the nodeId's are ordered just right)
7. Node 3 detects the remaining node has gone, enters ZenDiscovery minimum_master_nodes mode, prints a message indicating not enough nodes
8. Node 3 goes into a ping state looking for nodes
9. At this point, node 1 and node 2 report a valid cluster, they know about each other but not about node 3.
10. Reconnect network to node 3
11. Node 3 rejoins the cluster correctly, seeing that there is already a master in the cluster.
At this point, everything is working as expected.
1. Pull network cable from node 3 again
2. Node 3 detects master has gone (MasterFaultDetection)
3. Node 3 elects itself as master (because the nodeIds are ordered just right)
4. Node 3 now fails to detect that the remaining node in the cluster is not accessible. It starts throwing a number of Netty NoRouteToHostExceptions about the remaining node.
5. According to node 3, cluster health is yellow and cluster state shows 2 data nodes
6. Reconnect network to node 3
7. Node 3 appears to connect to the node that it thinks it's still connected to (you can see that via the cluster state API). The other nodes log nothing and do not show the disconnected node as connected in any way.
8. Node 3 at this point accepts indexing and search requests, a classic split brain.
Here's a gist with the patch to ZenDiscovery and the 3 node configs.
https://gist.github.com/3174651
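For context, a rough sketch of the kind of per-node elasticsearch.yml such a 3-node repro would use; the actual configs are in the gist above, and the cluster name and host addresses below are only illustrative assumptions:

```yaml
# elasticsearch.yml on each of the three test nodes (illustrative sketch,
# not the gist's exact configs)
cluster.name: splitbrain-test

# Majority of 3 master-eligible nodes: 3/2 + 1 = 2
discovery.zen.minimum_master_nodes: 2

# Fixed unicast discovery instead of multicast (host names are assumed)
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["node1:9300", "node2:9300", "node3:9300"]
```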
[GitHub issue, opened 17 Dec 2012, closed 1 Sep 2014; labels: bug, v2.0.0-beta1, v1.4.0.Beta1]
G'day,
I'm using ElasticSearch 0.19.11 with the unicast Zen discovery protocol.
With this setup, I can easily split a 3-node cluster into two 'hemispheres' (continuing with the brain metaphor) with one node acting as a participant in both hemispheres. I believe this to be a significant problem, because now `minimum_master_nodes` is incapable of preventing certain split-brain scenarios.
Here's what my 3-node test cluster looked like before I broke it:
![](https://saj.beta.anchortrove.com/es-splitbrain-1.png)
Here's what the cluster looked like after simulating a communications failure between nodes (2) and (3):
![](https://saj.beta.anchortrove.com/es-splitbrain-2.png)
Here's what seems to have happened immediately after the split:
1. Node (2) and (3) lose contact with one another. (`zen-disco-node_failed` ... `reason failed to ping`)
2. Node (2), still master of the left hemisphere, notes the disappearance of node (3) and broadcasts an advisory message to all of its followers. Node (1) takes note of the advisory.
3. Node (3) has now lost contact with its old master and decides to hold an election. It declares itself winner of the election. On declaring itself, it assumes master role of the right hemisphere, then broadcasts an advisory message to all of its followers. Node (1) takes note of this advisory, too.
At this point, I can't say I know what to expect to find on node (1). If I query both masters for a list of nodes, I see node (1) in both clusters.
Let's look at `minimum_master_nodes` as it applies to this test cluster. Assume I had set `minimum_master_nodes` to 2. Had node (3) been completely isolated from nodes (1) and (2), I would not have run into this problem. The left hemisphere would have enough nodes to satisfy the constraint; the right hemisphere would not. This would continue to work for larger clusters (with an appropriately larger value for `minimum_master_nodes`).
The problem with `minimum_master_nodes` is that it does not work when the split brains are intersecting, as in my example above. Even on a larger cluster of, say, 7 nodes with `minimum_master_nodes` set to 4, all that needs to happen is for the 'right' two nodes to lose contact with one another (a master election has to take place) for the cluster to split.
Is there anything that can be done to detect the intersecting split on node (1)?
Would #1057 help?
Am I missing something obvious? :)