Clarification on Recovery settings please


(Chris Neal) #1

Hi all,

I would like some clarification on the various cluster recovery settings, just to make sure I'm understanding them correctly.

I have a 6 node cluster, split like this:
3 master only
2 data only
1 client only

Currently, I have these settings:

gateway:
  expected_nodes: 6
  recover_after_data_nodes: 1
  recover_after_master_nodes: 1

What I'm not sure about is whether recover_after_data_nodes and recover_after_master_nodes are set correctly. What I think I'm telling ES is that the cluster has 6 nodes, but that it's OK to begin recovery once either 1 master or 1 data node is up.

Is this appropriate? Or should I wait to recover until, say, all masters are up? Or all data nodes are up? It wasn't clear to me in the documentation if one was preferable over the other!

Many thanks,
Chris


(Mark Walkom) #2

This page should give you a bit more guidance - https://www.elastic.co/guide/en/elasticsearch/guide/current/_important_configuration_changes.html#_recovery_settings

Let us know if you have more questions though :slight_smile:


(Chris Neal) #3

Thank you again Mark :smile:

That did make it clearer. To summarize, for a 10-node cluster (1 client-only, 3 master-only, and 6 data-only nodes, with an indexing strategy of 6 shards and 1 replica), I could provide:

gateway.recover_after_nodes: 6

That says, "Don't start a recovery until at least 6 data and/or master nodes are in the cluster."
Then, for more fine-grained control over the types of nodes you want present in the cluster, you can also provide:

gateway.recover_after_data_nodes: 5
gateway.recover_after_master_nodes: 1

But does that say, "Those 6 nodes must be made up of 1 master and 5 data nodes"?
Or, "When either 1 master node or 5 data nodes are present in the cluster, start a recovery"?

I think what I would like to say is, "Start a recovery when you have a master, and you have 5 of the 6 data nodes online."

Is that possible, or even necessary? I just don't want the cluster to see 1 master, treat that condition as satisfied, and start a recovery before enough data nodes are online, since that would cause an unnecessary shuffling of shards!

Thanks again for the assistance.
Chris


(Mark Walkom) #4

https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-gateway.html#recover-after explains that.

It'll wait for both that number of master nodes and that number of data nodes, though you really want to use one or the other.
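
To make the "and" versus "or" question concrete, here is a rough Python sketch of how the recover_after_* checks appear to combine, based on a reading of the linked gateway docs rather than on the actual Elasticsearch source. The names GatewaySettings and may_start_recovery are made up for illustration, the model assumes no node is both master-eligible and data (as in the clusters described in this thread), and it deliberately ignores expected_nodes and recover_after_time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GatewaySettings:
    """Hypothetical container mirroring the gateway.recover_after_* settings."""
    recover_after_nodes: Optional[int] = None         # master + data nodes combined
    recover_after_data_nodes: Optional[int] = None
    recover_after_master_nodes: Optional[int] = None


def may_start_recovery(settings: GatewaySettings,
                       master_nodes: int,
                       data_nodes: int) -> bool:
    """Return True only if every configured threshold is met (an AND, not an OR).

    Assumption: master_nodes and data_nodes count disjoint sets of nodes,
    so their sum is the number of master and/or data nodes in the cluster.
    """
    if (settings.recover_after_nodes is not None
            and master_nodes + data_nodes < settings.recover_after_nodes):
        return False
    if (settings.recover_after_data_nodes is not None
            and data_nodes < settings.recover_after_data_nodes):
        return False
    if (settings.recover_after_master_nodes is not None
            and master_nodes < settings.recover_after_master_nodes):
        return False
    return True


# The combination asked about in post #3: 1 master and 5 data nodes.
settings = GatewaySettings(recover_after_data_nodes=5,
                           recover_after_master_nodes=1)

one_master_two_data = may_start_recovery(settings, master_nodes=1, data_nodes=2)
one_master_five_data = may_start_recovery(settings, master_nodes=1, data_nodes=5)
no_master_six_data = may_start_recovery(settings, master_nodes=0, data_nodes=6)
```

Under this reading, a single master on its own does not satisfy the settings from post #3: recovery stays blocked until 5 data nodes are present as well, which matches the "start a recovery when you have a master and 5 of the 6 data nodes" intent.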

