Split brain due to 'on the fence' network partition

Hi all,

I have been seeing some strange behaviour running elasticsearch on AWS.

The setup is three nodes, each with the following settings:
cluster.name: <clustername>
bootstrap.mlockall: true
discovery.zen.ping.multicast.enabled : false
discovery.type : ec2
discovery.ec2.ping_timeout : 30s
discovery.ec2.groups: <group>
cloud.aws.region : <region>
action.disable_delete_all_indices : true
discovery.zen.minimum_master_nodes : 2

I have witnessed two occurrences of the following, given 3 nodes A, B and
C, all in the same availability zone:

  1. To start with, all nodes are connected in the cluster. A is the master.
  2. For some reason, nodes A and B cannot talk to each other, but both
    can still talk to C, and C can talk to A and B, i.e. an 'on the
    fence' network partition, as C can still see all:

    A: [2013-11-17 20:23:28,257][INFO ][cluster.service ] [A] removed {[B][sUv4amcFSdmaDAVDa7bUVg][inet[/<ipaddress>:9300]],}, reason: zen-disco-node_failed([B][sUv4amcFSdmaDAVDa7bUVg][inet[/<ipaddress>:9300]]), reason failed to ping, tried [3] times, each with maximum [30s] timeout

    B: [2013-11-17 20:25:27,543][INFO ][discovery.ec2 ] [B] master_left [[A][O25rauSQR7utohD0jg4RQw][inet[/<ipaddress>:9300]]], reason [failed to ping, tried [3] times, each with maximum [30s] timeout]
    [2013-11-17 20:25:27,547][INFO ][cluster.service ] [B] master {new [B][sUv4amcFSdmaDAVDa7bUVg][inet[/<ipaddress>:9300]], previous [A][O25rauSQR7utohD0jg4RQw][inet[/<ipaddress>:9300]]}, removed {[A][O25rauSQR7utohD0jg4RQw][inet[/<ipaddress>:9300]],}, reason: zen-disco-master_failed ([A][O25rauSQR7utohD0jg4RQw][inet[/<ipaddress>:9300]])

    C: [2013-11-17 20:23:28,256][INFO ][cluster.service ] [C] removed {[B][sUv4amcFSdmaDAVDa7bUVg][inet[/<ipaddress>:9300]],}, reason: zen-disco-receive(from master [[A][O25rauSQR7utohD0jg4RQw][inet[/<ipaddress>:9300]]])

As you can see, B is now a new master, but A has not been removed as a
master: A can still see C, so it still satisfies the minimum master
nodes criterion.
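
To make that reasoning concrete, here is a minimal sketch in Python of the check as it appears to behave from the logs: each node counts only the nodes it can reach itself (plus itself) and compares that count with minimum_master_nodes. This is a simplified model under that assumption, not Elasticsearch's actual zen discovery code, and the names LINKS, visible_from and may_be_master are made up for illustration.

```python
# Simplified model of the partition described above; not Elasticsearch code.
MINIMUM_MASTER_NODES = 2

NODES = ["A", "B", "C"]

# Working links after the A<->B connection is lost (links are symmetric).
LINKS = {frozenset({"A", "C"}), frozenset({"B", "C"})}


def visible_from(node):
    """Return the set of nodes that `node` can reach, including itself."""
    return {node} | {other for other in NODES
                     if frozenset({node, other}) in LINKS}


def may_be_master(node):
    """A node believes it has quorum if it sees enough nodes locally."""
    return len(visible_from(node)) >= MINIMUM_MASTER_NODES


for node in ("A", "B"):
    print(node, sorted(visible_from(node)), may_be_master(node))
```

In this model both A and B see two nodes (themselves and C), so both pass the check at the same time, which matches the two masters seen in the logs.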

When I ask B for its state, it responds stating that it is a master with
C.

When I ask A for its state, it responds stating that it is a master with
C.

When I ask C for its state, it responds with the same cluster state as A.
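
One way to spot this state automatically is to ask every node for its view and compare the answers. The sketch below is in Python and assumes each node's HTTP API is reachable on port 9200 and that the cluster state response carries a top-level master_node field holding a node id. The function names are hypothetical, and the sample views simply reuse the node ids from the logs above.

```python
import json
from urllib.request import urlopen


def fetch_master(host, port=9200, timeout=5):
    """Ask one node which node id it currently regards as master."""
    url = "http://%s:%d/_cluster/state" % (host, port)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["master_node"]


def find_split_brain(masters_by_node):
    """Return the distinct master ids reported, if there is more than one."""
    distinct = sorted(set(masters_by_node.values()))
    return distinct if len(distinct) > 1 else []


# Views as described in this post: A and C name A as master, B names itself.
views = {
    "A": "O25rauSQR7utohD0jg4RQw",
    "B": "sUv4amcFSdmaDAVDa7bUVg",
    "C": "O25rauSQR7utohD0jg4RQw",
}
print(find_split_brain(views))
```

Against a live cluster the views dictionary would be built with fetch_master(host) for each node address. A non-empty result means the nodes disagree about who the master is.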

This can be replicated by setting up three nodes (settings above) and then,
once a master has been established, dropping the connection between it and
the node you assume will be the next master (usually the next node in the
list after the master). I used the following commands:

On the master node (A): iptables -A INPUT -s <node B ip address> -j DROP

On the next node (B): iptables -A INPUT -s <node A ip address> -j DROP

This should get you into the same state that I have witnessed in AWS. Once
two masters are established, remove the iptables entries (by running
iptables -F on A and B). From what I understand, node discovery only
happens when a node is starting up or does not belong to a cluster, so as
these nodes do belong to a cluster they never discover each other again.
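
The reproduction can also be scripted. The Python sketch below only builds the command strings and does not execute anything. It differs from the steps above in one respect: it heals the partition with iptables -D, which deletes just the rule that was added, instead of iptables -F, which flushes every rule in the chain. The function names are hypothetical and the 192.0.2.x addresses are documentation placeholders.

```python
def partition_commands(ip_a, ip_b):
    """Commands that make A and B drop each other's traffic.

    Keys are the node on which the command should be run.
    """
    return {
        ip_a: "iptables -A INPUT -s %s -j DROP" % ip_b,
        ip_b: "iptables -A INPUT -s %s -j DROP" % ip_a,
    }


def heal_commands(ip_a, ip_b):
    """Commands that delete only the rules added by partition_commands."""
    return {
        ip_a: "iptables -D INPUT -s %s -j DROP" % ip_b,
        ip_b: "iptables -D INPUT -s %s -j DROP" % ip_a,
    }


# 192.0.2.x are documentation-only placeholder addresses.
for host, command in partition_commands("192.0.2.10", "192.0.2.11").items():
    print("on %s run: %s" % (host, command))
```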

I have tried this against versions 0.90.0, 0.90.4, 0.90.7 and
1.0.0.Beta1 of elasticsearch with no luck. I was using the
elasticsearch-cloud-aws plugin version 1.11.0 for elasticsearch version
0.90.0 and version 1.15.0 for elasticsearch versions 0.90.4, 0.90.7 and
1.0.0.Beta1.

I do not want to have to set minimum master nodes to 3, as for this use
case I value availability.
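
The trade-off can be put into numbers. This is a minimal Python sketch, assuming all three nodes are master-eligible and that a master can only be elected while at least minimum_master_nodes nodes are up and can see each other. The function names are made up for illustration.

```python
def majority(master_eligible_nodes):
    """Smallest node count that is more than half of the cluster."""
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes, minimum_master_nodes):
    """How many nodes can be lost while a master can still be elected."""
    return master_eligible_nodes - minimum_master_nodes


for minimum in (2, 3):
    print(minimum, tolerated_failures(3, minimum))
```

With three nodes the majority is 2, which tolerates the loss of one node. Raising the setting to 3 tolerates none, so any single node failure or broken link would leave the cluster without a master.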

Any help would be greatly appreciated.

Kind Regards,

Mark Tinsley

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

I think you should open an issue in elasticsearch project with that excellent description you wrote.
Don't know how it could be fixed though.

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


This issue is already reported here: https://github.com/elasticsearch/elasticsearch/issues/2488

No solution though.


Ha thanks Leonardo!

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr

On 20 November 2013 at 16:50:50, Leonardo Menezes (leonardo.menezess@gmail.com) wrote:

this issue is already reported here: https://github.com/elasticsearch/elasticsearch/issues/2488

no solution though.


Thanks for the replies. I'll take a look at the elasticsearch-zookeeper solution.

Cheers,
