Hi Bernt.
Here is a more specific scenario that may happen in production. I have described it at three points in time: t0, t1 and t2.
-
At t0: clustersize=1, so min_masters (discovery.zen.minimum_master_nodes) = (clustersize+1)/2 = 1. Two nodes, n0 and n1, come up with this config and with discovery.zen.ping.unicast.hosts: "10.0.0.1,10.0.0.2", where n0=10.0.0.1 and n1=10.0.0.2. The health output shows GREEN with both nodes in the cluster, life is good:
bash-4.2# curl https://10.0.0.2:9200/_cluster/health?pretty --key certs/admin/server8.key --cert certs/admin/server.crt --cacert certs/admin/cacert.crt
{
"cluster_name" : "elasticsearch",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 2,
"active_primary_shards" : 2,
"active_shards" : 4,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
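For reference, here is the quorum arithmetic as a small Python sketch. It assumes the commonly documented majority formula (master-eligible nodes / 2, rounded down, plus 1), which is not the formula I used in my config above; the function names are made up for illustration.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Majority of master-eligible nodes: floor(n / 2) + 1 (assumed formula)."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def split_brain_possible(master_eligible_nodes: int, min_masters: int) -> bool:
    """True if two disjoint groups of nodes could each satisfy min_masters."""
    return min_masters < quorum(master_eligible_nodes)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"nodes={n} quorum={quorum(n)}")
    # My setup: 2 nodes with min_masters=1
    print("split brain possible:", split_brain_possible(2, 1))
```

With two nodes and min_masters=1, each node alone satisfies the setting, which matches what I see at t1 and t2 below.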
-
Sometime between t0 and t1, network connectivity between the two nodes is lost. At t1, _cluster/health shows YELLOW:
bash-4.2# curl https://10.0.0.2:9200/_cluster/health?pretty --key certs/admin/server8.key --cert certs/admin/server.crt --cacert certs/admin/cacert.crt
{
"cluster_name" : "elasticsearch",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 2,
"active_shards" : 2,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 1,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 66.66666666666666
}
-
At t2: network connectivity between n0 and n1 is back up, but the cluster status is still YELLOW and the cluster keeps functioning with 1 node, even though discovery.zen.ping.unicast.hosts is "10.0.0.1,10.0.0.2". Because min_masters is still 1, the nodes continue to function in YELLOW state. I want both nodes to be part of the same cluster again. How can I get them to rejoin? Restarting one of the nodes triggers pings, and the cluster then forms, but is there a way to do this without a restart?
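To make the check I am doing by hand explicit, here is a rough Python sketch that reads a _cluster/health response and reports how many nodes are missing. The sample is a trimmed copy of the t1 response above; the helper names and the expected node count of 2 are my own assumptions for this setup.

```python
import json

# Trimmed copy of the t1 health response shown above.
SAMPLE_T1 = """
{
  "cluster_name": "elasticsearch",
  "status": "yellow",
  "number_of_nodes": 1,
  "number_of_data_nodes": 1,
  "unassigned_shards": 1
}
"""


def nodes_missing(health_json: str, expected_nodes: int) -> int:
    """Number of expected nodes that are absent from the health response."""
    health = json.loads(health_json)
    return max(0, expected_nodes - health["number_of_nodes"])


def summarize(health_json: str, expected_nodes: int) -> str:
    """One-line summary of status and node count versus the expected count."""
    health = json.loads(health_json)
    return "status={} nodes={}/{}".format(
        health["status"], health["number_of_nodes"], expected_nodes
    )


if __name__ == "__main__":
    print(summarize(SAMPLE_T1, expected_nodes=2))
    print("missing nodes:", nodes_missing(SAMPLE_T1, expected_nodes=2))
```

At t2 I would expect this to report 0 missing nodes once the cluster has re-formed, but it still reports 1 until I restart a node.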
Elasticsearch version: 6.1.1
thanks.
jala