Hi,
I am quite new to ES and am currently running some experiments to evaluate it. I
am trying to figure out the following case:
Assume that:
We start two nodes on two different PCs.
We create an index with 2 shards and 1 replica.
One of the primary shards goes to the first node and the other goes to the
second node.
Two applications write data to these nodes, and the applications also run on
two different PCs.
If the connection between the nodes goes away, each ES node will
declare its replicas as primary, and the applications will continue to write
data to their local nodes.
Now we have two nodes with different data on each.
Now, I want to ask some questions:
What will happen if the connection comes back? Will the nodes discover each
other? (I read on the user list that one of them has to be restarted for
them to rediscover each other.)
What will happen if I restart one of the nodes after the connection comes back?
What will happen to the data written on the restarted node?
1. Now, we have two nodes with different data on each
This is known as a split brain.
What will happen if the connection comes back? Will the nodes discover
each other? (I read on the user list that one of them has to be
restarted for them to rediscover each other.)
That is correct.
What will happen if I restart one of the nodes after the connection comes
back? What will happen to the data written on the restarted node?
The data on the other node will be lost and will need to be reindexed.
Currently there is no way of merging these two clusters back together.
This issue in master allows you to configure your cluster to reduce the
likelihood of this situation, by preventing writes unless a quorum of
nodes is present:
This doesn't solve the issue, but it will at least make you aware that
there IS an issue, because your writes will start failing.
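
To make the quorum idea concrete, here is a minimal sketch of a generic
majority check. It is not Elasticsearch code and does not claim to match how
ES counts a quorum; the function names quorum_size and can_accept_writes are
made up for illustration, and a strict majority rule is an assumption.

```python
def quorum_size(total_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority (assumed rule)."""
    return total_nodes // 2 + 1


def can_accept_writes(visible_nodes: int, total_nodes: int) -> bool:
    """True if this side of a partition can still see a majority of the cluster."""
    return visible_nodes >= quorum_size(total_nodes)


if __name__ == "__main__":
    # Two-node cluster from the question: after the link drops, each node
    # sees only itself, so under this rule both sides refuse writes.
    print(can_accept_writes(visible_nodes=1, total_nodes=2))
    # Three-node cluster split 2/1: only the two-node side keeps writing.
    print(can_accept_writes(visible_nodes=2, total_nodes=3))
    print(can_accept_writes(visible_nodes=1, total_nodes=3))
```

Under a strict majority rule, a two-node cluster stops accepting writes as
soon as either node is unreachable. With three nodes, at most one side of a
partition can hold a majority, so the two sides cannot both keep writing.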