Elasticsearch scenarios HA


(davor.sharic) #1

Hello. I am testing Elasticsearch and I'm planning to put it in production.
I have read the Elasticsearch manual, and Google also helped a lot in
figuring out how it works. So,
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_scale_horizontally.html
here is the basic scenario of how ES behaves in HA and failover, what
happens with shards, etc. Let's presume I have 3 nodes: node1 is
master, node2 and node3 are data nodes, and both are eligible to become
master. They all have the setting discovery.zen.minimum_master_nodes: 2.

If I kill node1, the master (shut down the server, or just presume there is a
network problem), then one of the other nodes would become the master. On the
link I posted we can see that the replica shards of the missing primary shards
that were on node1 (master) would now be turned into primaries. So, what happens
if I turn node1 (the previous master), which I killed, back on? Node1 has a
setting in elasticsearch.yml saying that it is a master: node.master: true. Will
node1 then become the new master? What would happen to the primary shards that
it has in its data? And what would happen to the primary shards that were
converted from replicas when the master was first killed, and which are now on
other nodes? Would that cause the split-brain problem?

Which brings us to: how do you reboot a server that is the master node in an ES
cluster without another node becoming master during that period? Should I
shut down the whole cluster and then do the maintenance of the servers, or can
it somehow be put in maintenance mode? I mean, if I have Debian servers and
every one of them is a node in ES, and some are master eligible, how do you
patch and reboot them?

And one more question. In another scenario, if I kill a node, the primary
shards which that node held are no longer available. The master then decides
which replicas on other nodes are going to become primary shards. Now I
bring that same node back, and it still has the primary shards it had before
I killed it, while the cluster has already converted replica shards into
primaries to replace the missing ones. What would happen to the node that was
killed and is now brought back, and what would happen to its shards?

Thanks.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/b55d2abc-12e6-40c1-8cac-cf5be2ed0353%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Mark Walkom) #2

Node1 will rejoin the cluster and the other nodes will tell it they have
already voted on a new master, which will prevent it from taking control.
node.master: true just means it is master eligible. The rest of the cluster
nodes need to agree, via election, which node will be the active master.
Your other questions are answered below.

For the second part, you don't; you let ES elect a new master. Unless you
specifically want to keep the other nodes from becoming master, in which case
you would have to take the whole cluster down, which defeats the HA nature of
ES. But you may have reasons for wanting to do this.

And for the last question: the returning node usually just discards the shards
it has, and then the cluster rebalances from shards that it knows are good (this
also applies to the last few questions in your first part). You can tell ES not
to rebalance shards if some disappear, see
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html,
which can be useful if you need to restart a node.
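
As a rough illustration of the allocation setting mentioned above, here is a minimal Python sketch that builds (but does not send) a cluster settings update turning shard allocation off before a node restart and back on afterwards. It assumes the `cluster.routing.allocation.enable` setting is available in your ES version and that a node answers on `localhost:9200`; the helper name `allocation_request` is made up for this example.

```python
import json
import urllib.request

# Assumption: a node of the cluster is reachable here; adjust for your setup.
ES_URL = "http://localhost:9200"


def allocation_request(mode, base_url=ES_URL):
    """Build a PUT /_cluster/settings request that sets shard allocation.

    mode is "none" (stop allocating shards) or "all" (back to normal).
    The setting is sent as transient, so it does not survive a full
    cluster restart.
    """
    if mode not in ("all", "none"):
        raise ValueError("mode must be 'all' or 'none'")
    body = {"transient": {"cluster.routing.allocation.enable": mode}}
    return urllib.request.Request(
        base_url + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    before_restart = allocation_request("none")
    after_restart = allocation_request("all")
    # Only print the requests; nothing is sent to a cluster here.
    for req in (before_restart, after_restart):
        print(req.get_method(), req.full_url)
        print(req.data.decode("utf-8"))
    # To actually apply one of them: urllib.request.urlopen(before_restart)
```

The idea would be to send the "none" request before stopping the node and the "all" request once it has rejoined, so the cluster does not start moving shards around for a short, planned outage.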

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

On 14 May 2014 20:50, davor.sharic@gmail.com wrote:




(davor.sharic) #3

Thanks for the answer. The idea is to have 3 nodes, all master eligible. And
with discovery.zen.minimum_master_nodes: 2 I think that will be enough to avoid
the split-brain problem. I could also reboot the nodes for, e.g., yearly
maintenance and the cluster would just switch roles; I should just wait for
it to sync between reboots of the nodes. I think.
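
One way to make "wait for it to sync" concrete is to check cluster health between reboots. The sketch below is only an assumption about how that could be scripted: it builds a `/_cluster/health` URL using the `wait_for_status` and `timeout` parameters and decides from a parsed response whether to move on to the next node. The host and the helper names `health_url` and `safe_to_reboot_next` are invented for the example, and the sample responses are written by hand, not taken from a real cluster.

```python
import json
from urllib.parse import urlencode

# Assumption: a node of the cluster is reachable here; adjust for your setup.
ES_URL = "http://localhost:9200"


def health_url(base_url=ES_URL, timeout="60s"):
    """URL that blocks until the cluster is green or the timeout expires."""
    query = urlencode({"wait_for_status": "green", "timeout": timeout})
    return base_url + "/_cluster/health?" + query


def safe_to_reboot_next(health):
    """Decide from a parsed /_cluster/health response whether to continue.

    Requires a green status, no timeout, and no shards that are still
    relocating, initializing, or unassigned.
    """
    return (
        health.get("status") == "green"
        and not health.get("timed_out", False)
        and health.get("relocating_shards", 0) == 0
        and health.get("initializing_shards", 0) == 0
        and health.get("unassigned_shards", 0) == 0
    )


if __name__ == "__main__":
    print(health_url())
    # Hand-written sample responses for illustration only.
    recovering = json.loads(
        '{"status": "yellow", "timed_out": true, "unassigned_shards": 3}'
    )
    recovered = json.loads(
        '{"status": "green", "timed_out": false, "unassigned_shards": 0}'
    )
    print(safe_to_reboot_next(recovering))
    print(safe_to_reboot_next(recovered))
```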

Another idea would be 4 nodes: 2 data only and 2 master eligible with no data.
Could I then use discovery.zen.minimum_master_nodes: 1 and still avoid the
split-brain problem? I mean, if the master gets disconnected because of a
network issue, then the other master-eligible node will become master and
the cluster keeps working. But, because of the setting
discovery.zen.minimum_master_nodes: 1, would the disconnected master form
its own cluster of just itself with the same name, and would we then have a
split-brain problem? Maybe it wouldn't, because there are no data nodes
available to it, and after the network comes back the disconnected master
would just become a regular master-eligible node?
I could use discovery.zen.minimum_master_nodes: 2, but with 2 masters I
wouldn't get HA: if one fails, the whole cluster is in a failed state. If this
is all true, then the best scenario should be 2 data nodes and 3 master-eligible
nodes?



(Mark Walkom) #4

With 2 master nodes and discovery.zen.minimum_master_nodes: 1 you still risk
a split brain.
If the masters lose connectivity to each other for whatever reason, they will
each form their own cluster, and if each data node happens to be talking to a
different master-eligible node at that time then you will have two clusters,
each with a master and a data node.
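
The scenario above can be sketched as a toy model in Python. This is a simplification for illustration, not how zen discovery is actually implemented: it only assumes that a side of a network partition can elect a master when it sees at least `minimum_master_nodes` master-eligible nodes. The node names and both function names are invented.

```python
def sides_that_can_elect(partition, master_eligible, minimum_master_nodes):
    """Return the sides of a network partition that could elect a master.

    partition: list of sets of node names that can still see each other.
    master_eligible: set of node names with node.master: true.
    """
    return [
        side
        for side in partition
        if len(side & master_eligible) >= minimum_master_nodes
    ]


def has_split_brain(partition, master_eligible, minimum_master_nodes):
    """True when more than one side could end up with its own master."""
    sides = sides_that_can_elect(partition, master_eligible, minimum_master_nodes)
    return len(sides) > 1


if __name__ == "__main__":
    # 2 master-only nodes (m1, m2) and 2 data nodes (d1, d2), split in half.
    two_masters = {"m1", "m2"}
    split = [{"m1", "d1"}, {"m2", "d2"}]
    print(has_split_brain(split, two_masters, 1))  # both sides elect a master
    print(has_split_brain(split, two_masters, 2))  # neither side can elect

    # 3 master-eligible nodes with minimum_master_nodes: 2, one cut off.
    three_masters = {"m1", "m2", "m3"}
    split3 = [{"m1"}, {"m2", "m3", "d1", "d2"}]
    print(sides_that_can_elect(split3, three_masters, 2))
```

In this model, 2 masters with a minimum of 1 gives two independent clusters, 2 masters with a minimum of 2 gives no master on either side, and 3 masters with a minimum of 2 leaves exactly one side able to carry on.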

If you want HA as you describe it, then you want 3 master-eligible nodes
with discovery.zen.minimum_master_nodes: 2.
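
The value 2 for 3 master-eligible nodes follows the usual majority rule, (number of master-eligible nodes / 2) + 1 with integer division. A small Python sketch of that arithmetic, with made-up function names, also shows why 2 master-eligible nodes give no headroom:

```python
def minimum_master_nodes(master_eligible_count):
    """Majority of the master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible_count < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_count // 2 + 1


def tolerated_master_failures(master_eligible_count):
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible_count - minimum_master_nodes(master_eligible_count)


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(
            n,
            "master-eligible nodes -> minimum_master_nodes:",
            minimum_master_nodes(n),
            "tolerated failures:",
            tolerated_master_failures(n),
        )
```

With 2 master-eligible nodes the majority is 2, so losing either one leaves the cluster without a master; with 3 the majority is still 2, so one node can be rebooted or lost.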

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

On 14 May 2014 22:20, davor.sharic@gmail.com wrote:




(davor.sharic) #5

Thanks, got it :)


