What I'm not sure about is whether the recover_after_data_nodes and recover_after_master_nodes settings are correct. What I think I'm telling ES is that the cluster has 6 nodes, but it's OK to begin recovery once either 1 master or 1 data node is up.
Is this appropriate? Or should I wait to recover until, say, all masters are up? Or all data nodes are up? It wasn't clear to me from the documentation whether one is preferable to the other!
That did make it clearer. To summarize, for a 10-node cluster (1 client-only, 3 master-only, and 6 data-only nodes, with an indexing strategy of 6 shards and 1 replica), I could provide:
gateway.recover_after_nodes: 6
To say "Don't start a recovery until at least 6 data and/or master nodes are in the cluster."
Then, for more fine-grained control of the types of nodes you want present in the cluster, you can also provide gateway.recover_after_master_nodes and gateway.recover_after_data_nodes (here, 1 and 5).
But does that say, "Those 6 nodes must be made up of 1 master and 5 data nodes"?
Or, "When either 1 master node or 5 data nodes are present in the cluster, start a recovery"?
I think what I would like to say is, "Start a recovery when you have a master, and you have 5 of the 6 data nodes online."
Is that possible, or necessary? I just don't want the cluster to see 1 master, treat that condition as satisfied, and start a recovery while too few data nodes are online, which would cause an unnecessary shuffling of shards!
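To make the two readings concrete, here is a minimal Python sketch of the difference. It is not Elasticsearch code and does not settle which reading Elasticsearch implements. The `ClusterState` class and both `ready_*` functions are hypothetical names made up for illustration, and the thresholds (1 master, 5 data nodes) are just the values from the question.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Hypothetical snapshot of how many nodes of each type have joined."""
    master_nodes: int
    data_nodes: int


# Thresholds taken from the question; assumed values, not defaults.
RECOVER_AFTER_MASTER_NODES = 1
RECOVER_AFTER_DATA_NODES = 5


def ready_if_all(state: ClusterState) -> bool:
    """Reading 1: every threshold must be met before recovery starts."""
    return (state.master_nodes >= RECOVER_AFTER_MASTER_NODES
            and state.data_nodes >= RECOVER_AFTER_DATA_NODES)


def ready_if_any(state: ClusterState) -> bool:
    """Reading 2: meeting any single threshold is enough to start recovery."""
    return (state.master_nodes >= RECOVER_AFTER_MASTER_NODES
            or state.data_nodes >= RECOVER_AFTER_DATA_NODES)


# The worrying case: one master is up, but only 2 of the 6 data nodes.
early = ClusterState(master_nodes=1, data_nodes=2)
print(ready_if_all(early))  # False: recovery waits for more data nodes
print(ready_if_any(early))  # True: recovery starts, shards may get shuffled
```

Under the first reading, the case I'm worried about (1 master, too few data nodes) never triggers a recovery; under the second it does.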