Cluster has 3 master nodes: I typically bring up 1 as the 'seed' node and then add two others to point to it. Once the nodes have all agreed on who is the boss I kill the 'seed' node and scale up the others to 3.
This morning I wanted to update the config so the non-seed nodes would use persistent storage. The 'seed' master was still in charge, so I dropped / restarted the others. Now the 'seed' master won't acknowledge the new wannabe-masters. Errors show that it's looking (in vain) for the original wannabes.
Seems (to me) that there should be a timeout value set on the master to stop looking for other lost masters and permit new entrants. Either that or create a PUT command to do the same.
Note: after 30+ minutes of waiting, the errors on the seed-master show that it has found the wannabes, but because they aren't in the list of nodes it's expecting, it won't let them join.
Further: messages from the 'wannabes' show that they have found the seed-master but it isn't showing up as eligible.
and this node must discover master-eligible nodes [<seed master>] to bootstrap a cluster:
have discovered <list of nodes including other wannabes _and_ the seed master>...
sigh - now I have to delete the seed-master and rebuild the cluster... the largest hassle being that all the data/ingest nodes are still pointed at the old (broken) master, so the _state folders all have to be tracked down and removed.
Unfortunately this would be unsafe, i.e. would lead to data loss. The only safe thing to do after losing a majority of master nodes is to remain unavailable indefinitely (or until those nodes reappear).
In fact the whole process you describe sounds pretty unsafe. Master-eligible nodes are required to use persistent storage. If they do not then this is the sort of thing that can happen.
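For what it's worth, a minimal sketch of what a master-eligible node's config might look like with persistent storage (the node name and mount paths here are hypothetical, and this uses the pre-7.9 `node.master` syntax; newer versions would use `node.roles` instead):

```yaml
# elasticsearch.yml for a dedicated master-eligible node (illustrative only)
node.name: master-1                  # hypothetical node name
node.master: true                    # master-eligible
node.data: false                     # no data role
node.ingest: false                   # no ingest role
path.data: /mnt/es-persistent/data   # must survive container/VM restarts
path.logs: /mnt/es-persistent/logs
```

The key point is that `path.data` sits on storage that outlives a restart; a master that comes back with an empty data path is, as far as the cluster is concerned, a brand-new node.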
RE: _state folder - I can understand why a data node would find a cluster and stick with it, but I'm not clear on why this value is written to disk. It means that when I rebuild a cluster I have to find / remove this folder, otherwise when the data nodes come up they try to (re)find the old cluster-master.
If your master nodes had storage that persists across restarts then none of this would be necessary. Since you've fixed the storage on the masters, you need to rebuild this cluster one last time and then everything should become much simpler.
Every time I go through this I learn / encounter something new.
Today's entertainment: ingest nodes are failing to come up, claiming they can't find the master node. (5 data nodes are up and the 3 master nodes have agreed with one another.)
Interesting side note: I had to bring down a data node to try and clear a stuck kibana_2 alias. When it came back up it started giving the same error msg as the ingest node(s).
I'm thinking I'm going to have to kill the whole cluster again and keep the seed-master up until all the data and ingest nodes are up. Seems like this will potentially give me fits down the road if I have to restart any of the data/ingest nodes.
It's definitely based on the absence of a seed-master. Once I (re)started that (and subsequently reinitialized the whole cluster) everything resolved and joined properly.
This seems like a bug (to me). Once the seed + non-seed masters have formed a coalition, nodes (ingest, data, etc.) should be able to join regardless of which master node is actually in charge.
Don't believe me? Try it.
1. Create a seed-master node
2. Create 2 (or more) non-seed master nodes and join them to the seed-master
3. Create a non-master node and join it to the cluster - note that it works
4. Kill the seed-master
5. Create a non-master node and (attempt to) join it to the cluster - note that it doesn't work
The only way I can think of to reproduce this is by misconfiguring discovery.seed_hosts. You have mentioned this idea of a "seed" node multiple times but you are using this word quite differently from how Elasticsearch uses it. discovery.seed_hosts should refer to all of your master-eligible nodes, but I suspect you are configuring this setting to refer only to the single node that gets killed in step 4. Once that node is dead, the node created in step 5 cannot find the rest of the cluster because it doesn't know any of their addresses.
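In other words, assuming three master-eligible nodes (the names and addresses below are hypothetical, not taken from this thread), every node in the cluster - data and ingest nodes included - would carry discovery settings along these lines, rather than pointing at a single "seed":

```yaml
# elasticsearch.yml discovery settings (illustrative 7.x syntax, hypothetical hosts)
discovery.seed_hosts:                   # list ALL master-eligible nodes, not just one
  - master-1.example.internal:9300
  - master-2.example.internal:9300
  - master-3.example.internal:9300

# Only consulted the very first time the cluster bootstraps;
# belongs on the master-eligible nodes and can be removed afterwards.
cluster.initial_master_nodes:
  - master-1
  - master-2
  - master-3
```

With every node seeded with all three master addresses, losing any one master no longer prevents new nodes from finding and joining the cluster.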