I am currently studying Elasticsearch and reading The Definitive Guide online.
While reading it, I came across a part that seems to contain a contradiction.
What this means is that Elasticsearch will do the following:
Wait for eight nodes to be present
Begin recovering after 5 minutes or after ten nodes have joined the cluster, whichever comes first.
However, the sample configuration sets a value of 10.
So why bother with recover_after_nodes if expected_nodes is also set for it to wait for anyway, you ask?
This is where the recover_after_time setting comes in.
With these 3 settings configured as above, recovery will commence as soon as one of the following two scenarios occurs:
either:
a) 10 expected nodes are present
or
b) at least 8 nodes are present in the cluster and 5 minutes have elapsed since cluster start.
Why?
This lets you recover immediately if all 10 nodes come back within 5 minutes, or start recovery anyway if something is wrong with a couple of nodes but you still have at least 8.
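To make the two scenarios concrete, here is a minimal Python sketch of the decision as I described it. The function name and parameters are made up for illustration, and this only models the description in this post, not Elasticsearch's actual implementation.

```python
def should_start_recovery(nodes_present, minutes_elapsed,
                          recover_after_nodes=8, expected_nodes=10,
                          recover_after_minutes=5):
    """Hypothetical helper: True when recovery would begin under the two scenarios."""
    # Scenario (a): all expected nodes are present, so recover immediately.
    if nodes_present >= expected_nodes:
        return True
    # Scenario (b): the minimum node count is met and the wait time has passed.
    return (nodes_present >= recover_after_nodes
            and minutes_elapsed >= recover_after_minutes)


print(should_start_recovery(10, 1))   # all 10 nodes back early
print(should_start_recovery(8, 3))    # 8 nodes, still inside the 5 minute window
print(should_start_recovery(8, 5))    # 8 nodes, 5 minutes elapsed
print(should_start_recovery(7, 30))   # fewer than 8 nodes, no matter how long
```

The first and third calls return True; the second and fourth return False, because either the wait time or the minimum node count has not been satisfied.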
You would adjust recover_after_nodes to match how many replicas you have, so that the cluster can come up with all primary shards allocated and reach at least yellow state. Your number of replicas and nodes dictates how many nodes need to be in the cluster for all primary shards to be allocated successfully.
Let me know if you would like more detail, including some replica examples for this setting.
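As a rough way to reason about that number, here is a worst-case sketch in Python. It assumes every shard had all of its copies allocated on distinct nodes before the restart, and that any surviving copy can serve as the primary. The function name is hypothetical, and a real cluster may get by with fewer nodes depending on where the shards actually live.

```python
def min_nodes_for_all_primaries(total_nodes, replicas_per_shard):
    """Hypothetical helper: smallest node count that still guarantees
    at least one copy of every shard, in the worst case."""
    # Each shard has 1 primary plus its replicas, each copy on a different node,
    # so a shard can never have more copies than there are nodes.
    tolerable_missing = min(replicas_per_shard, total_nodes - 1)
    # Losing that many nodes still leaves one copy of every shard somewhere.
    return total_nodes - tolerable_missing


print(min_nodes_for_all_primaries(10, 2))  # 10 nodes, 2 replicas per shard
print(min_nodes_for_all_primaries(10, 1))  # 10 nodes, 1 replica per shard
```

Under these assumptions, a 10-node cluster with 2 replicas per shard gives 8, which would line up with a recover_after_nodes of 8, while 1 replica per shard gives 9.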