Discovery with ECS, Dynamic Initial Master Nodes

I am trying to set up a cluster with Elasticsearch 7 on Amazon ECS, but I would like to get a bit of understanding of an approach for setting up the initial master nodes. I can bring up my stack if I explicitly assign each container host in my elasticsearch.yml. However, I don't want to create a new config file every time the underlying architecture changes. Is there a way for the EC2 Discovery plugin to find the initial master/data nodes without assigning them explicitly?
This works:

cluster.name: test
network.host: 0.0.0.0
network.publish_host: _ec2:privateIp_
transport.publish_host: _ec2:privateIp_
discovery.seed_hosts: {{COMMA-SEPARATED IPS}}
cluster.initial_master_nodes:
  - LIST OF IPS
node.name: _ec2:privateIp_
discovery:
  zen:
    hosts_provider: ec2
    minimum_master_nodes: 3
  ec2:
    host_type: private_ip
    tag.Name: tag
    proxy.host: proxy_url
    proxy.port: port
    protocol: http
cloud.node.auto_attributes: true
action.destructive_requires_name: true
bootstrap.memory_lock: true
xpack.security.enabled: false
xpack.ml.enabled: false
xpack.graph.enabled: false
xpack.watcher.enabled: false

But again, I don't want to explicitly list the IPs. Would implementing a coordinating node fix this? Any ideas?

Once your cluster has formed, the cluster.initial_master_nodes setting is ignored and can be removed. Starting up a stateful service like Elasticsearch for the first time is always a bit special, but once it's up and running you can adjust the architecture fairly freely.

If you are using the EC2 seed provider then you do not need to set discovery.seed_hosts at all.

discovery.zen.minimum_master_nodes is ignored and deprecated in version 7. You should remove this.

discovery.zen.hosts_provider is renamed to discovery.seed_providers in version 7, although the old name still works for now. You should adjust your config to use the new setting name.
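Putting those two points together, a minimal version-7 discovery section might look like the sketch below (the tag and proxy values are placeholders carried over from the config above):

```yaml
# Version 7 style: discovery.seed_providers replaces discovery.zen.hosts_provider.
# discovery.seed_hosts is not needed because the EC2 provider supplies the seed addresses.
discovery:
  seed_providers: ec2
  ec2:
    host_type: private_ip
    tag.Name: tag          # placeholder from the original config
    proxy.host: proxy_url  # placeholder
    proxy.port: port       # placeholder
    protocol: http
```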

I understand that the setting is ignored after startup, but I can't get the cluster to find master/data nodes without it. This is the key problem: I want my cluster to dynamically find the master nodes without me setting them explicitly. If I am using the EC2 Discovery plugin, should that set up the initial master nodes for me?

Can you explain "If you are using the EC2 seed provider then you do not need to set discovery.seed_hosts at all."? Are you referring to the EC2 Discovery plugin?

No, the initial master nodes are not really anything to do with discovery. They're there to hold the first election in a brand-new cluster.

Yes, sorry, I did indeed mean the discovery-ec2 plugin.

Then what you are saying is there is no way to dynamically create the first election in a brand new cluster? I have to hard-code that?

I'm not sure I'm following. Are you creating brand-new clusters on a regular basis? How are you orchestrating this? How are you doing the other one-off configuration tasks needed on new clusters, like installing templates and setting up snapshot repositories?

Could you, for instance, give the master nodes more predictable names?

We have a use case where we will need to bring down and start up a brand-new cluster, and that is what I'm trying to orchestrate.

For one-off configurations, we wait until the stack has been created and then automatically curl the API to update settings in our CloudFormation scripts.

I wouldn't want to give each node a different hard-coded name, as this would require a different Docker file for each node. But I could create a coordinating node and have a hard-coded name for that, which I could reference in my master/data nodes. I think you refer to this strategy elsewhere.

Ok, I'm not too familiar with CloudFormation so maybe someone else can comment on other possibilities too.

This doesn't seem so bad to me. It'd just be for the master-eligible nodes, not all nodes, and there's normally only three of them. Can you explain in more detail why you wouldn't want to do this?

Yes, another way to hold the first election is to have one special node that elects itself and then all the other nodes just join it. This isn't robust to the case where this one special node doesn't start successfully, but maybe it's easier for you to orchestrate.

Note that it's just the master-eligible nodes that matter for the first election. Master-ineligible nodes (i.e. data nodes, assuming you have dedicated master nodes) completely ignore the cluster.initial_master_nodes setting.

Three is still too many, as ECS would then have to dynamically choose which Docker image to use, and we would have to maintain three almost identical images. I don't think this is a good idea.

This seems to be a good approach. Would the config look something like this?

cluster.name: test
node.data: false
node.master: true
network.host: 0.0.0.0
network.publish_host: _ec2:privateIp_
transport.publish_host: _ec2:privateIp_
cluster.initial_master_nodes:
  - election_node
node.name: election_node
discovery:
  zen:
    hosts_provider: ec2
  ec2:
    host_type: private_ip
    tag.Name: somethinghere
    proxy.host: somethinghere
    proxy.port: something
    protocol: http
cloud.node.auto_attributes: true
action.destructive_requires_name: true
bootstrap.memory_lock: true
xpack.security.enabled: false
xpack.ml.enabled: false
xpack.graph.enabled: false
xpack.watcher.enabled: false

Yes, that looks about right (except discovery.zen.hosts_provider -> discovery.seed_providers).

Note that you wouldn't set cluster.initial_master_nodes at all on any other nodes in this case.
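For illustration, the other (non-election) nodes might then carry a config like this sketch, with `cluster.initial_master_nodes` omitted entirely so they discover and join the election node via EC2 discovery (the tag value is the same placeholder as in the election-node config):

```yaml
# Sketch for the other nodes: no cluster.initial_master_nodes here.
# They find the election node through the EC2 seed provider and join it.
cluster.name: test
node.name: _ec2:privateIp_
network.host: 0.0.0.0
network.publish_host: _ec2:privateIp_
transport.publish_host: _ec2:privateIp_
discovery:
  seed_providers: ec2
  ec2:
    host_type: private_ip
    tag.Name: somethinghere  # placeholder tag shared with the election node
```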

You might also like to call this API before shutting the election node down:

POST /_cluster/voting_config_exclusions/election_node

and then this after it's gone:

DELETE /_cluster/voting_config_exclusions

That'll make sure that the cluster is ready to lose this node before it goes away. Otherwise there's a risk that you shut it down too soon leaving the other nodes without a master.
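Scripted, the shutdown sequence might look something like this sketch (assuming the cluster is reachable at `localhost:9200`; `election_node` is the node name from the config above):

```shell
# Ask the cluster to move election_node out of the voting configuration;
# this call waits until it is safe to stop the node.
curl -X POST "localhost:9200/_cluster/voting_config_exclusions/election_node"

# ... stop the election node here ...

# Clear the exclusions list once the node has left the cluster.
curl -X DELETE "localhost:9200/_cluster/voting_config_exclusions"
```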

Very helpful. Thank you. Is there any harm in leaving the election node there?

It depends. Does it have persistent storage or is there a chance it could restart with an empty data directory? If it could lose its data then it'll form a new, empty, cluster when restarted, because of the cluster.initial_master_nodes setting, and this could cause you some problems.


Ok, that makes sense. Thanks for your help, David.

