Has anyone else gone through the fun of bringing up a new cluster in AWS now that we have this new "bootstrapping" stuff in 7.0?
The problem I seem to have is that the cluster wants me to specify the master nodes to bootstrap the cluster. The issue is, I am bringing up these nodes under an AWS autoscaling group and I really don't know what they are until they are up.
I would have thought that "discovery.seed_providers: ec2" would have taken care of this for me, but alas... no... it still wants to see "cluster.initial_master_nodes" and a list of who can be the first master.
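For context, the configuration in question looks something like this (node names here are placeholders, not a recommendation). The seed provider only handles finding hosts to talk to; `cluster.initial_master_nodes` is what 7.0 additionally requires for the very first bootstrap:

```yaml
# elasticsearch.yml — example node names
discovery.seed_providers: ec2     # discovery-ec2 plugin supplies seed hosts
cluster.initial_master_nodes:     # required only for first-time bootstrap
  - master-node-0
  - master-node-1
  - master-node-2
```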
Anyway... just curious if anyone else has already gone through this and can shed some light on how to do this. I guess I could do some user-data magic and build a list of my master nodes and stick them in the elastic config... but that seems like a lot of work... and really a bit hacky...
Perhaps the simplest way to launch a brand-new cluster in an ASG is actually to launch it outside the ASG, with known node names, and then attach the instances to the ASG. Alternatively you can launch the cluster without bootstrapping it in the ASG and then briefly run another Elasticsearch node to do the bootstrapping. I also know of some people who are using an external Zookeeper to orchestrate the node names and the bootstrapping, although it's only really worth doing this if you're launching a lot of new clusters.
It's not recommended to have each instance wait for the other instances to come up using the EC2 APIs before bootstrapping (like discovery-ec2 would). The upper bound on an ASG isn't a guarantee, and the EC2 APIs don't always return the full picture, so it's possible you could get 4 nodes instead of 3 and accidentally bootstrap two clusters.
I've stumbled upon the same issue with bootstrapping 7.0 in AWS. We have infrastructure as code and every single AWS resource is managed from CloudFormation, and manipulating EC2 instances or even ASGs outside of CloudFormation to bootstrap a cluster seems to me an even more error-prone process.
I guess, with the discovery.zen.minimum_master_nodes parameter removed, there is no other reliable way for the discovery-ec2 plugin to determine the initial number of master-eligible nodes. As @DavidTurner mentioned, the AWS API might be flaky.
However, the functionality that used to be in the plugin will still need to be worked around in UserData.
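For example, a UserData script that has somehow assembled the master node names still has to render them into elasticsearch.yml identically on every node. A minimal sketch of just that rendering step (the discovery part is the hard bit and is elided here; the helper name and node names are illustrative only):

```python
# Hypothetical helper for a UserData script: turn a discovered set of master
# node names into the cluster.initial_master_nodes line for elasticsearch.yml.
# Sorting the names means every node emits the exact same list, which is
# required for safe bootstrapping.
def initial_master_nodes_line(names):
    return "cluster.initial_master_nodes: [" + ", ".join(sorted(names)) + "]"

print(initial_master_nodes_line(["master-2", "master-0", "master-1"]))
# cluster.initial_master_nodes: [master-0, master-1, master-2]
```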
In my experience the start of a new stateful service is always a bit "special", no matter what service it is or how it's managed. After setting up a service for the first time there's normally some other one-time initialisation needed (e.g. creating templates, restoring from snapshots), which is quite different from the normal day-to-day maintenance of a running service.
For the record, I'd also very much like for Elasticsearch to integrate even more tightly with EC2 auto-scaling groups, but as far as we can tell the means to do so correctly don't currently exist. Autoscaling works with data nodes, of course, and also works with master-eligible nodes once bootstrapped, but there doesn't seem to be a truly satisfactory solution for bootstrapping right now.
We surveyed a number of other strongly-consistent stateful services during development (e.g. Zookeeper, etcd, Consul) and found these to be in pretty much the same boat. I'm open to ideas.
David, appreciate your response.
I'm pretty sure my idea was already discussed inside the Elastic team, but it might be worth asking anyway. Would it make sense to add a cluster.minimum_initial_master_nodes setting and let the discovery plugins do their job? Once the minimum number of master nodes has been discovered, the cluster is considered bootstrapped and the setting is no longer required and is ignored.
It doesn't, unfortunately, although we did try this idea. The logic we used was to say that if we expect 3 nodes then we'd bootstrap the cluster after discovering a majority of 2 (in case the 3rd doesn't come up).
The trouble is that with many orchestrators there's a risk that you might get too many nodes. The consequences are severe: if you expect 3 nodes but get 4 then you could end up forming 2 separate 2-node clusters and there's no way to merge them together again. I've seen this with EC2 ASGs (once, when that AWS region was suffering other failures, but that's enough for me).
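To make that failure mode concrete, here's a toy sketch (plain Python, nothing to do with the real Elasticsearch implementation) of the majority-of-expected rule going wrong when a fourth node appears:

```python
# Toy model: each node bootstraps once it has discovered a majority of the
# *expected* node count. Node names and views are invented for illustration.
EXPECTED = 3
QUORUM = EXPECTED // 2 + 1  # 2 of an expected 3

def can_bootstrap(visible_nodes):
    """True if this node's view is large enough to trigger bootstrapping."""
    return len(visible_nodes) >= QUORUM

# Four instances come up instead of three, and because of inconsistent
# discovery results A only sees B, while C only sees D.
view_of_a = {"A", "B"}
view_of_c = {"C", "D"}

# Both disjoint views independently satisfy the quorum check...
assert can_bootstrap(view_of_a) and can_bootstrap(view_of_c)
# ...so A/B and C/D each bootstrap: two separate clusters from one deployment.
```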
In the case of ec2 and discovery, why not also allow for instance tagging to help decide as to what is going to be a master and then leave it on the end-user to configure correctly? Something along the lines of the following could work:
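A hypothetical configuration along those lines, using the discovery-ec2 plugin's tag filtering (the `role` tag name and value are examples, not a prescription):

```yaml
# elasticsearch.yml on master-eligible nodes — tag name/value are examples
node.master: true
discovery.seed_providers: ec2
discovery.ec2.tag.role: master   # only consider instances tagged role=master as seed hosts
```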
That's my point. Why can't it be used for bootstrapping if it is tagged properly? For initial bootstrapping, you could just mandate that all master nodes have to be online, and ease it up from there. If there are more nodes at initial bootstrapping, that becomes an end-user problem. When does initial bootstrap have to take place? Just when firing up new clusters? Or if all the masters are down?
I think we're misunderstanding each other. Bootstrapping is the process of starting a cluster, for which we need to know the identities of the set of master nodes consistently across all the master nodes in the cluster. This requires significantly more coordination than is possible with tags alone.
This is only required the very first time the cluster starts up: nodes that have already joined a cluster store this information in their data folder and freshly-started nodes that are joining an existing cluster obtain this information from the cluster’s elected master.
I don't see a miscommunication. I just don't understand why the service can't be coordinated with query mechanisms to determine the nodes, and why they need to be set manually, when you can effectively filter out EC2 nodes using:
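Presumably a query along these lines (the tag names, cluster name, and output field are examples):

```shell
aws ec2 describe-instances \
  --filters "Name=tag:role,Values=master" \
            "Name=tag:cluster,Values=my-cluster" \
            "Name=instance-state-name,Values=running" \
  --query "Reservations[].Instances[].PrivateDnsName"
```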
And you will have a list of machines that are tagged as master, running, and using cluster name X (assuming you defined your cluster name in the config). Everything returned there should be usable for discovery, let alone initial bootstrap, no? While I get the concept of an explicit list of masters, it seems a little dated given the modern era of cloud computing and dynamic hosts.
Because there's absolutely no guarantee that EC2 APIs yield results that are consistent enough for bootstrapping. It's entirely possible that you could have 4 instances running, A, B, C and D, but A only sees B via EC2's DescribeInstances API, and C only sees D. The consequence would be that you would bootstrap 2 different clusters, and that's precisely what we're trying to guarantee doesn't happen.
Indeed, it would be awesome if this problem were solvable, but given today's tools it simply doesn't seem to be. It's the same problem that other strongly-consistent stateful services face, including Zookeeper, Consul, Etcd and more, and none of these have found a satisfactory answer either. Etcd has the most attractive approach IMO, allowing you to use an existing Etcd cluster to bootstrap a new Etcd cluster, but this still raises the question of how you start the existing Etcd cluster in the first place. It's turtles all the way down.
Fair enough. How frequently is the API inconsistent when returning data? I suppose maybe if you have a staggered host setup. Wouldn't requiring a hard minimum number of bootstrap masters fix this? I want 3 masters; 3 masters must be online to bootstrap, no exceptions. If there are 4, I suppose it becomes a race over which one gets ignored. At that point, I would make it an end-user problem. (Especially since this is initial, the assumption is that there's nothing to lose data-wise; game over, start again.)
Putting a lower bound on the number of nodes needed for bootstrapping still doesn't prevent you from forming multiple clusters, because as you rightly point out you might end up with more nodes than you asked for. We're not really interested in probabilities here, because there are a lot of Elasticsearch clusters out there and eventually one of them is going to hit every corner case there is.
There could well be data to lose. You might not be able to tell you've formed more than one cluster until you've started indexing into them all, at which point you can't in general merge the data back together again.
For those who wish to live dangerously: this assumes you had already set up discovery in the past and that you're already tagging your nodes. Use at your own risk; it may cause outages or, as described above, a split-brain scenario. There is no testing, so make your own backups, verify your nodes before running this, etc. One could modify it to run before deployment starts and make it even more deterministic. Either way, double-check your work. I'm not responsible for you breaking things.
You must set cluster.initial_master_nodes to the same list of nodes on each node on which it is set in order to be sure that only a single cluster forms during bootstrapping and therefore to avoid the risk of data loss.
I can only recommend against doing what this script does, but at the end of the day it's your data and your decision. When it breaks you get to keep both pieces 🤷
I've added a comment linking from your script back to this thread.