Elasticsearch 7.0 bootstrapping in AWS

(David Lambert) #1

Has anyone else gone through the fun of bringing up a new cluster in AWS now that we have this new "bootstrapping" stuff in 7.0?

The problem I seem to have is that the cluster wants me to specify the master nodes to bootstrap the cluster. The issue is, I am bringing up these nodes under an AWS autoscaling group and I really don't know what they are until they are up.

I would have thought that "discovery.seed_providers: ec2" would have taken care of this for me, but alas... No... it still wanted to see "cluster.initial_master_nodes" and a list of who can be the first master.

Anyway... just curious if anyone else has already gone through this and can shed some light on how to do this. I guess I could do some user-data magic and build a list of my master nodes and stick them in the elastic config... but that seems like a lot of work... and really a bit hacky...



(David Turner) #2

Hi David,

Perhaps the simplest way to launch a brand-new cluster in an ASG is actually to launch it outside the ASG, with known node names, and then attach the instances to the ASG. Alternatively you can launch the cluster without bootstrapping it in the ASG and then briefly run another Elasticsearch node to do the bootstrapping. I also know of some people who are using an external Zookeeper to orchestrate the node names and the bootstrapping, although it's only really worth doing this if you're launching a lot of new clusters.
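As a rough illustration of the first option (launching with known node names), here is a minimal sketch that renders the relevant elasticsearch.yml lines for one master-eligible node. The function name, cluster name, and node names are hypothetical and not from this thread; only the setting names are real Elasticsearch settings.

```python
def render_bootstrap_config(cluster_name, node_name, master_names):
    """Render an elasticsearch.yml fragment for one master-eligible node
    of a brand-new cluster whose node names are known up front.

    All arguments are illustrative; nothing here talks to AWS.
    """
    if node_name not in master_names:
        # This sketch only covers the initial master-eligible nodes.
        raise ValueError(f"{node_name} is not one of the initial masters")

    lines = [
        f"cluster.name: {cluster_name}",
        f"node.name: {node_name}",
        "discovery.seed_providers: ec2",
        "cluster.initial_master_nodes:",
    ]
    # Every initial master gets the same, fixed list of names.
    lines += [f"  - {name}" for name in master_names]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    masters = ["master-a", "master-b", "master-c"]  # hypothetical names
    print(render_bootstrap_config("demo-cluster", "master-a", masters))
```

The point of the fixed list is that every node agrees on exactly who is allowed to form the first cluster, which is what a dynamically discovered list cannot guarantee.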

It's not recommended to have each instance wait for the other instances to come up using the EC2 APIs before bootstrapping (like discovery-ec2 would). The upper bound on an ASG isn't a guarantee, and the EC2 APIs don't always return the full picture, so it's possible you could get 4 nodes instead of 3 and accidentally bootstrap two clusters.

(David Lambert) #3

OK... thanks for the input...

I'm not sure I agree with it... but then I'm not privy to all the issues with starting up a new cluster either.

For future releases, it would be great if the discovery-ec2 module would handle this sort of thing and take the burden off us.

(Oleg Khoruzhenko) #4

I've stumbled upon the same issue with bootstrapping 7.0 in AWS. We use infrastructure as code, and every single AWS resource is managed from CloudFormation. Manipulating EC2 instances or even ASGs outside of CloudFormation to bootstrap a cluster seems to me an even more error-prone process.

I guess that, with the discovery.zen.minimum_master_nodes parameter removed, there is no other reliable way for the discovery-ec2 plugin to determine the initial number of master-eligible nodes. As @DavidTurner mentioned, the AWS API might be flaky.

However, the functionality that used to be in the plugin will still need to be worked around in UserData.


(David Turner) #5

In my experience the start of a new stateful service is always a bit "special", no matter what service it is or how it's managed. After setting up a service for the first time there's normally some other one-time initialisation needed (e.g. creating templates, restoring from snapshots), which is quite different from the normal day-to-day maintenance of a running service.

For the record, I'd also very much like for Elasticsearch to integrate even more tightly with EC2 auto-scaling groups, but as far as we can tell the means to do so correctly currently doesn't exist. Autoscaling works with data nodes, of course, and also works with master-eligible nodes once bootstrapped, but there doesn't seem to be a truly satisfactory solution for bootstrapping right now.

We surveyed a number of other strongly-consistent stateful services during development (e.g. Zookeeper, etcd, Consul) and found these to be in pretty much the same boat. I'm open to ideas.

(Oleg Khoruzhenko) #6

David, appreciate your response.
I'm pretty sure my idea was already discussed inside the Elastic team, but it might be worth asking anyway. Would it make sense to add a cluster.minimum_initial_master_nodes setting and let the discovery plugins do their job? Once the minimum number of master nodes has been discovered, the cluster is considered bootstrapped and the setting is no longer required and is ignored.

(David Turner) #7

It doesn't, unfortunately, although we did try this idea. The logic we used was to say that if we expect 3 nodes then we'd bootstrap the cluster after discovering a majority of 2 (in case the 3rd doesn't come up).

The trouble is that with many orchestrators there's a risk that you might get too many nodes. The consequences are severe: if you expect 3 nodes but get 4 then you could end up forming 2 separate 2-node clusters and there's no way to merge them together again. I've seen this with EC2 ASGs (once, when that AWS region was suffering other failures, but that's enough for me).
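The arithmetic behind that failure can be sketched with a toy model. This is not Elasticsearch's actual logic; the function names and the idea of representing discovery as groups of nodes that have found each other are assumptions made for illustration.

```python
def majority(expected):
    """Smallest number of nodes that is more than half of `expected`."""
    return expected // 2 + 1


def clusters_formed(groups, expected):
    """Count how many groups would bootstrap independently.

    groups: list of sets of node names; each set is a group of nodes that
            have discovered each other but not (yet) the other groups.
    expected: the number of master-eligible nodes the operator planned for.
    """
    quorum = majority(expected)
    # Each group that reaches the quorum believes it is safe to bootstrap.
    return sum(1 for group in groups if len(group) >= quorum)


if __name__ == "__main__":
    # Planned for 3 nodes, got 3: only one group can reach a quorum of 2.
    print(clusters_formed([{"a", "b"}, {"c"}], expected=3))       # 1
    # Planned for 3 nodes, got 4: two disjoint pairs both reach the quorum.
    print(clusters_formed([{"a", "b"}, {"c", "d"}], expected=3))  # 2
```

With exactly 3 nodes, two disjoint groups of 2 cannot exist, so the majority rule is safe. With a 4th unexpected node that guarantee disappears, which is the scenario described above.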