Version: 1.4.
Say there are 2 nodes, X and Y, both capable of becoming master.
When the network goes down, the nodes get disconnected from each other and
each assumes the responsibility of master.
When the network is restored, they don't ping each other and form a cluster.
The Elasticsearch service has to be restarted on one of the nodes for them
to form a cluster. Even after they form a cluster, all primary shards
remain on one node (the one on which the service was restarted), and all
replica shards are on the other node.
After the network goes down, they lose communication with each other. After
that, they become split.
They both think they are masters. Even if they think they are masters,
shouldn't the ping happen to see if there are other nodes in the cluster?
The number of replicas is set to 1. If ES doesn't differentiate, why are
some shards primary and others replica?
On Monday, 4 May 2015 10:48:24 UTC+5:30, Mark Walkom wrote:
Why are they becoming split anyway? GC, other load, network?
Not if they both think they are masters.
Are you running replicas? If so ES doesn't really differentiate
between the two.
Your nodes aren't in different DCs, are they? If so, this is why we don't
support such setups: ES is latency sensitive, and these sorts of
things can happen very easily when your network is unreliable.
They don't try to ping other nodes because you only have two, and if they
lose contact with one another then they both assume they are masters and
create their own cluster. Masters don't ping other nodes at random and see
if they should be joining a different cluster.
Logically there is no difference between a primary and a replica shard; the
only physical difference is a flag that tells the cluster state which is
which. This is why ES will never assign a primary and its corresponding
replica to the same node.
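As a rough illustration of the point above, here is a toy Python model. It is
not Elasticsearch code; `ShardCopy`, `can_allocate` and `promote_on_node_loss`
are names made up for this sketch. It assumes that a primary and a replica
differ only by a flag, that two copies of the same shard never share a node,
and that a surviving replica is promoted when the node holding the primary
leaves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardCopy:
    shard_id: int
    node: str
    primary: bool  # in this toy model, the only difference between primary and replica


def can_allocate(copies, shard_id, node):
    """A node may hold at most one copy (primary or replica) of a given shard."""
    return all(not (c.shard_id == shard_id and c.node == node) for c in copies)


def promote_on_node_loss(copies, lost_node):
    """Drop copies on the lost node, then promote one surviving replica
    for every shard that no longer has a primary."""
    survivors = [c for c in copies if c.node != lost_node]
    have_primary = {c.shard_id for c in survivors if c.primary}
    result = []
    for c in survivors:
        if c.shard_id not in have_primary:
            # Promotion is just flipping the flag on an existing copy.
            result.append(ShardCopy(c.shard_id, c.node, True))
            have_primary.add(c.shard_id)
        else:
            result.append(c)
    return result


# Hypothetical two-node layout with one replica per shard.
copies = [
    ShardCopy(0, "X", True), ShardCopy(0, "Y", False),
    ShardCopy(1, "Y", True), ShardCopy(1, "X", False),
]
after_loss = promote_on_node_loss(copies, "X")
```

In this model, once node X is gone every copy left on Y carries the primary
flag, so a node that rejoins later can only receive replicas.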
You cannot get around the root of your problem unless you add another node
and set minimum master nodes to ensure a majority quorum.
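To make the quorum argument concrete, a small Python sketch follows. In
Elasticsearch 1.x the relevant setting is `discovery.zen.minimum_master_nodes`,
which defaults to 1. The helper names `quorum` and `sides_that_can_elect` are
invented here, and the model only counts master-eligible nodes on each side of
a network partition; it ignores everything else about zen discovery.

```python
def quorum(master_eligible: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    return master_eligible // 2 + 1


def sides_that_can_elect(partition_sizes, minimum_master_nodes):
    """Return the partition sizes that still see enough master-eligible
    nodes to elect a master under the given minimum."""
    return [size for size in partition_sizes if size >= minimum_master_nodes]


if __name__ == "__main__":
    # Two nodes, minimum left at 1: both halves elect a master (split brain).
    print(sides_that_can_elect([1, 1], 1))          # [1, 1]
    # Two nodes, minimum 2: no split brain, but neither half can elect a master.
    print(sides_that_can_elect([1, 1], quorum(2)))  # []
    # Three nodes, minimum 2: only the two-node side elects a master.
    print(sides_that_can_elect([2, 1], quorum(3)))  # [2]
```

The middle case shows why raising the minimum alone does not help a two-node
cluster: it avoids split brain only by leaving both halves without a master,
which is why the third node is needed.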
Looks like the only way around this would be to add more nodes and set
minimum masters to ensure a majority quorum.
Thanks.
Gourav
On Monday, 4 May 2015 12:02:27 UTC+5:30, Jason Wee wrote:
Why must you have only two nodes? Would it be possible to add one more
node so that split brain does not become an issue?
jason
In non "big data" scenarios, having two servers for a database is done
simply to achieve high availability. Most databases use a master client
scenario, but Elasticsearch does not support such a setup. It really should,
because not everyone has tons of data.
Ivan, not affiliated with the OP