Elasticsearch 6.2.4 nodes can’t discover each other in AWS

Hello. I am having a hard time getting my ES nodes to discover each other.

Problem
I am getting the same error (with a different IP) on each node:

[...][INFO ][o.e.d.z.ZenDiscovery     ] [node-1] 
    failed to send join request to master [{node-2}{...}{...}{10.2.0.8}{10.2.0.8:9300}{aws_availability_zone=us-east-1a}], 
    reason [RemoteTransportException[[node-2][10.2.0.8:9300][internal:discovery/zen/join]]; 
    nested: NotMasterException[Node [{node-2}{...}{...}{10.2.0.8}{10.2.0.8:9300}{aws_availability_zone=us-east-1a}] 
    not master for join request]; ], tried [3] times

telnet works fine: I can connect to 10.2.0.8:9300 from that machine.
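For anyone who wants to repeat the telnet check against several peers at once, here is a minimal sketch using only the Python standard library. The function name port_open is made up for illustration; it only proves that a TCP connection to the transport port can be opened, which (as the error above shows) does not by itself mean the nodes will join.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling port_open("10.2.0.8", 9300) from node-1 is the equivalent of the telnet test described above.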

Setup
I followed the official guide (running-elasticsearch-on-aws), but it is a bit outdated. I have three m5.2xlarge machines in the same auto-scaling group and availability zone (at least for now). Each instance has an ElasticSearch tag with the same value: <app_name>-es-node.
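To double-check that the tag really selects the expected instances, something like the sketch below could be run from one of the machines. It is not the discovery-ec2 plugin's own code, only an approximation of the same lookup. It assumes the third-party boto3 package is installed and that the instance role allows ec2:DescribeInstances; the helper names build_ec2_filters and private_ips are hypothetical, and restricting to running instances is my assumption.

```python
def build_ec2_filters(tag_key, tag_value, zones=None):
    """Build EC2 DescribeInstances filters for a tag and optional zones."""
    filters = [
        {"Name": f"tag:{tag_key}", "Values": [tag_value]},
        # Assumption: only running instances are of interest here.
        {"Name": "instance-state-name", "Values": ["running"]},
    ]
    if zones:
        filters.append({"Name": "availability-zone", "Values": list(zones)})
    return filters


def private_ips(region, filters):
    """Return the private IPs of all instances matching the filters."""
    import boto3  # third-party AWS SDK, assumed installed and credentialed

    ec2 = boto3.client("ec2", region_name=region)
    ips = []
    paginator = ec2.get_paginator("describe_instances")
    for page in paginator.paginate(Filters=filters):
        for reservation in page["Reservations"]:
            for instance in reservation["Instances"]:
                # Instances without a private address are skipped.
                if "PrivateIpAddress" in instance:
                    ips.append(instance["PrivateIpAddress"])
    return ips
```

With the setup described here, private_ips("us-east-1", build_ec2_filters("ElasticSearch", "<app_name>-es-node")) should list three addresses, one per node.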

/etc/sysconfig/elasticsearch:

ES_PATH_CONF=/etc/elasticsearch
ES_STARTUP_SLEEP_TIME=5
ES_HEAP_SIZE=15g
MAX_LOCKED_MEMORY=unlimited

/etc/elasticsearch/elasticsearch.yml (version 1):

cloud.node.auto_attributes: true

cluster.name: <app_name>-elasticsearch
cluster.routing.allocation.awareness.attributes: aws_availability_zone

path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch

bootstrap.memory_lock: false

node.name: node-1

network.host: 0.0.0.0
network.publish_host: _ec2:privateIp_
transport.publish_host: _ec2:privateIp_

s3.client.default.endpoint: s3.us-east-1.amazonaws.com

discovery.zen.hosts_provider: ec2
discovery.zen.minimum_master_nodes: 2

discovery.ec2.endpoint: ec2.us-east-1.amazonaws.com
discovery.ec2.tag.ElasticSearch: <app_name>-es-node

/etc/elasticsearch/elasticsearch.yml (version 2):

cloud.node.auto_attributes: true

cluster.name: <app_name>-elasticsearch
cluster.routing.allocation.awareness.attributes: aws_availability_zone

path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch

bootstrap.memory_lock: false

node.name: node-1 #other values: node-2, node-3
node.master: true
node.data: true

network.host: 0.0.0.0
network.publish_host: _ec2:privateIp_
network.bind_host: _ec2:privateIp_

transport.publish_host: _ec2:privateIp_
transport.tcp.port: 9300
http.port: 9200

s3.client.default.endpoint: s3.us-east-1.amazonaws.com

discovery.zen.hosts_provider: ec2
discovery.zen.join_timeout: 90s
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping_timeout: 30s

discovery.ec2.endpoint: ec2.us-east-1.amazonaws.com
discovery.ec2.tag.ElasticSearch: <app_name>-es-node
discovery.ec2.availability_zones: us-east-1a
discovery.ec2.node_cache_time: 120s
discovery.ec2.protocol: https

plugin.mandatory:
- discovery-ec2
- repository-s3

The error is the same for both setups. What else can I try?

Update: removing /var/lib/elasticsearch/node fixed the problem.
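The thread does not say why clearing the data directory helped. One possible explanation, and this is only an assumption, is that the nodes had the same node ID persisted under path.data (for example if all instances came from one machine image). A rough way to test that theory before deleting anything is to compare the ID each node reports about itself via the _nodes/_local API. The helper names below are hypothetical.

```python
import json
from collections import defaultdict
from urllib.request import urlopen


def local_node_id(host, port=9200, timeout=5.0):
    """Fetch the node ID that a single node reports about itself."""
    url = f"http://{host}:{port}/_nodes/_local"
    with urlopen(url, timeout=timeout) as resp:
        body = json.load(resp)
    # The "nodes" object is keyed by node ID; _local yields a single entry.
    return next(iter(body["nodes"]))


def duplicate_ids(ids_by_host):
    """Map each node ID seen on more than one host to those hosts."""
    hosts_by_id = defaultdict(list)
    for host, node_id in ids_by_host.items():
        hosts_by_id[node_id].append(host)
    return {
        node_id: sorted(hosts)
        for node_id, hosts in hosts_by_id.items()
        if len(hosts) > 1
    }
```

Collecting local_node_id(ip) for each of the three private IPs into a dict and passing it to duplicate_ids gives an empty result when every node has its own ID.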

Final config (for multiple availability zones):

cluster.name: <app_name>-elasticsearch

path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch

node.master: true
node.data: true

http.host: 0.0.0.0
network.host: 0.0.0.0
network.publish_host: _ec2:privateIp_
transport.publish_host: _ec2:privateIp_

discovery.zen.hosts_provider: ec2
discovery.zen.minimum_master_nodes: 2

discovery.ec2.endpoint: ec2.us-east-1.amazonaws.com
discovery.ec2.tag.ElasticSearch: <app_name>-es-node
discovery.ec2.availability_zones: us-east-1a, us-east-1c, us-east-1d
discovery.ec2.node_cache_time: 120s
discovery.ec2.protocol: https
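A note on discovery.zen.minimum_master_nodes: 2 in the config above. It matches the quorum rule from the Elasticsearch documentation, (master_eligible_nodes / 2) + 1 with integer division, for three master-eligible nodes. A small sketch of that arithmetic, in case the group is resized later (the function name is made up):

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


print(minimum_master_nodes(3))  # prints 2
```

For five master-eligible nodes the same rule gives 3, so the setting has to be updated whenever the number of master-eligible nodes changes.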
