I am attempting to add an Elasticsearch node to my current single-node cluster, but the new node fails to join.
I deploy the Elasticsearch Docker container with Ansible, so I'll post only the relevant portions of the playbooks. Both nodes run ES version 7.5.0.
Current ES Node:
- name: run elasticsearch docker container
  docker_container:
    name: elasticsearch
    image: "elasticsearch:{{ elk_version_tag }}"
    state: started
    restart_policy: unless-stopped
    user: 1000
    volumes:
      - /opt/elk-docker/elasticsearch/data:/usr/share/elasticsearch/data
      - /opt/elasticsearch/ssl:/usr/share/elasticsearch/config/ssl
    log_driver: "json-file"
    log_options:
      max-size: "200m"
      max-file: "3"
    ports:
      - 9200:9200
      - 9300:9300
    env:
      node.master: "true"
      http.host: "0.0.0.0"
      transport.host: "0.0.0.0"
      xpack.security.enabled: "true"
      xpack.monitoring.enabled: "true"
      cluster.routing.allocation.disk.threshold_enabled: "true"
      node.name: "elk-1"
      cluster.name: "elk"
      cluster.initial_master_nodes: "elk-1"
      ELASTIC_PASSWORD: "{{ xpack_password }}"
      ES_JAVA_OPTS: -Xms12g -Xmx12g
      xpack.security.http.ssl.enabled: "true"
      xpack.security.http.ssl.client_authentication: "optional"
      xpack.security.transport.ssl.client_authentication: "none"
      xpack.security.transport.ssl.enabled: "true"
      xpack.security.http.ssl.key: /usr/share/elasticsearch/config/ssl/elasticsearch.key
      xpack.security.http.ssl.certificate: /usr/share/elasticsearch/config/ssl/elasticsearch.pem
      xpack.security.transport.ssl.key: /usr/share/elasticsearch/config/ssl/elasticsearch.key
      xpack.security.transport.ssl.certificate: /usr/share/elasticsearch/config/ssl/elasticsearch.pem
    ulimits:
      - nofile:65536:65536
New Node:
- name: run elasticsearch docker container
  docker_container:
    name: elasticsearch
    image: "elasticsearch:{{ elasticsearch_version }}"
    state: started
    restart_policy: unless-stopped
    user: 1000
    volumes:
      - elasticsearch:/usr/share/elasticsearch/data
      - /opt/elasticsearch/ssl:/usr/share/elasticsearch/config/ssl
    log_driver: "json-file"
    log_options:
      max-size: "200m"
      max-file: "3"
    ports:
      - 9200:9200
      - 9300:9300
    env:
      http.host: "0.0.0.0"
      transport.host: "0.0.0.0"
      xpack.security.enabled: "true"
      xpack.monitoring.enabled: "true"
      xpack.security.http.ssl.enabled: "true"
      xpack.security.http.ssl.certificate: "/usr/share/elasticsearch/config/ssl/elasticsearch.pem"
      xpack.security.http.ssl.key: "/usr/share/elasticsearch/config/ssl/elasticsearch.key"
      xpack.security.transport.ssl.enabled: "true"
      xpack.security.transport.ssl.certificate: "/usr/share/elasticsearch/config/ssl/elasticsearch.pem"
      xpack.security.transport.ssl.key: "/usr/share/elasticsearch/config/ssl/elasticsearch.key"
      xpack.security.transport.ssl.verification_mode: "none"
      discovery.seed_hosts: "master.my.domain"
      node.name: "elk-2"
      cluster.name: "elk"
      cluster.initial_master_nodes: "elk-1"
      ELASTIC_PASSWORD: "{{ elastic_password }}"
      ES_JAVA_OPTS: "{{ es_java_opts }}"
    ulimits:
      - nofile:65536:65536
      - memlock:-1:-1
For privacy reasons, I replaced my current ES node's resolvable hostname with master.my.domain. I know that the new node's Elasticsearch container can reach the master node:
[elasticsearch@edb12d778337 ~]$ curl https://master.my.domain:9200
{"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":["Bearer realm=\"security\"","ApiKey","Basic realm=\"security\" charset=\"UTF-8\""]}}],"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":["Bearer realm=\"security\"","ApiKey","Basic realm=\"security\" charset=\"UTF-8\""]}},"status":401}
[elasticsearch@edb12d778337 ~]$
(I didn't include credentials in the curl request because I'm only showing that the new node can reach the master.)
On the master node, I can see traffic on port 9300 coming from my new node via tcpdump:
root@master:~# tcpdump -i ens160 host new-node.my.domain
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens160, link-type EN10MB (Ethernet), capture size 262144 bytes
10:00:18.414667 IP new-node.my.domain.60378 > master.my.domain.9300: Flags [S], seq 810488902, win 29200, options [mss 1460,sackOK,TS val 551847975 ecr 0,nop,wscale 7], length 0
10:00:18.414930 IP master.my.domain.9300 > new-node.my.domain.60378: Flags [S.], seq 2321176096, ack 810488903, win 28960, options [mss 1460,sackOK,TS val 904464596 ecr 551847975,nop,wscale 7], length 0
10:00:18.415334 IP new-node.my.domain.60378 > master.my.domain.9300: Flags [.], ack 1, win 229, options [nop,nop,TS val 551847976 ecr 904464596], length 0
10:00:18.417440 IP new-node.my.domain.60378 > master.my.domain.9300: Flags [P.], seq 1:333, ack 1, win 229, options [nop,nop,TS val 551847978 ecr 904464596], length 332
10:00:18.417503 IP master.my.domain.9300 > new-node.my.domain.60378: Flags [.], ack 333, win 235, options [nop,nop,TS val 904464599 ecr 551847978], length 0
10:00:18.434308 IP master.my.domain.9300 > new-node.my.domain.60378: Flags [P.], seq 1:6858, ack 333, win 235, options [nop,nop,TS val 904464616 ecr 551847978], length 6857
10:00:18.434734 IP new-node.my.domain.60378 > master.my.domain.9300: Flags [.], ack 6858, win 336, options [nop,nop,TS val 551847996 ecr 904464616], length 0
10:00:18.437061 IP new-node.my.domain.60378 > master.my.domain.9300: Flags [P.], seq 333:525, ack 6858, win 336, options [nop,nop,TS val 551847998 ecr 904464616], length 192
10:00:18.437777 IP master.my.domain.9300 > new-node.my.domain.60378: Flags [P.], seq 6858:7009, ack 525, win 243, options [nop,nop,TS val 904464619 ecr 551847998], length 151
10:00:18.438789 IP new-node.my.domain.60378 > master.my.domain.9300: Flags [P.], seq 525:710, ack 7009, win 358, options [nop,nop,TS val 551848000 ecr 904464619], length 185
10:00:18.439277 IP master.my.domain.9300 > new-node.my.domain.60378: Flags [P.], seq 7009:7413, ack 710, win 252, options [nop,nop,TS val 904464621 ecr 551848000], length 404
10:00:18.439703 IP new-node.my.domain.60378 > master.my.domain.9300: Flags [P.], seq 710:750, ack 7413, win 381, options [nop,nop,TS val 551848001 ecr 904464621], length 40
10:00:18.439815 IP new-node.my.domain.60378 > master.my.domain.9300: Flags [F.], seq 750, ack 7413, win 381, options [nop,nop,TS val 551848001 ecr 904464621], length 0
10:00:18.440010 IP master.my.domain.9300 > new-node.my.domain.60378: Flags [F.], seq 7413, ack 751, win 252, options [nop,nop,TS val 904464621 ecr 551848001], length 0
10:00:18.440152 IP new-node.my.domain.60378 > master.my.domain.9300: Flags [.], ack 7414, win 381, options [nop,nop,TS val 551848001 ecr 904464621], length 0
10:00:19.416700 IP new-node.my.domain.60406 > master.my.domain.9300: Flags [S], seq 2278802906, win 29200, options [mss 1460,sackOK,TS val 551848977 ecr 0,nop,wscale 7], length 0
10:00:19.416813 IP master.my.domain.9300 > new-node.my.domain.60406: Flags [S.], seq 2901458393, ack 2278802907, win 28960, options [mss 1460,sackOK,TS val 904465598 ecr 551848977,nop,wscale 7], length 0
10:00:19.417032 IP new-node.my.domain.60406 > master.my.domain.9300: Flags [.], ack 1, win 229, options [nop,nop,TS val 551848978 ecr 904465598], length 0
10:00:19.417909 IP new-node.my.domain.60406 > master.my.domain.9300: Flags [P.], seq 1:333, ack 1, win 229, options [nop,nop,TS val 551848979 ecr 904465598], length 332
10:00:19.417962 IP master.my.domain.9300 > new-node.my.domain.60406: Flags [.], ack 333, win 235, options [nop,nop,TS val 904465599 ecr 551848979], length 0
10:00:19.418675 IP new-node.my.domain.53310 > master.my.domain.1029: Flags [.], seq 1893618506:1893622850, ack 3895652123, win 229, options [nop,nop,TS val 2240775860 ecr 2384323137], length 4344
10:00:19.418814 IP new-node.my.domain.53310 > master.my.domain.1029: Flags [P.], seq 4344:10133, ack 1, win 229, options [nop,nop,TS val 2240775860 ecr 2384323137], length 5789
10:00:19.419077 IP master.my.domain.1029 > new-node.my.domain.53310: Flags [.], ack 10133, win 6276, options [nop,nop,TS val 2384324559 ecr 2240775860], length 0
10:00:19.422954 IP master.my.domain.1029 > new-node.my.domain.53310: Flags [P.], seq 1:7, ack 10133, win 6276, options [nop,nop,TS val 2384324563 ecr 2240775860], length 6
10:00:19.423091 IP new-node.my.domain.53310 > master.my.domain.1029: Flags [.], ack 7, win 229, options [nop,nop,TS val 2240775864 ecr 2384324563], length 0
10:00:19.436556 IP master.my.domain.9300 > new-node.my.domain.60406: Flags [P.], seq 1:6858, ack 333, win 235, options [nop,nop,TS val 904465618 ecr 551848979], length 6857
10:00:19.436962 IP new-node.my.domain.60406 > master.my.domain.9300: Flags [.], ack 6858, win 336, options [nop,nop,TS val 551848998 ecr 904465618], length 0
10:00:19.440869 IP new-node.my.domain.60406 > master.my.domain.9300: Flags [P.], seq 333:525, ack 6858, win 336, options [nop,nop,TS val 551849002 ecr 904465618], length 192
10:00:19.441765 IP master.my.domain.9300 > new-node.my.domain.60406: Flags [P.], seq 6858:7009, ack 525, win 243, options [nop,nop,TS val 904465623 ecr 551849002], length 151
10:00:19.443158 IP new-node.my.domain.60406 > master.my.domain.9300: Flags [P.], seq 525:710, ack 7009, win 358, options [nop,nop,TS val 551849004 ecr 904465623], length 185
10:00:19.443528 IP master.my.domain.9300 > new-node.my.domain.60406: Flags [P.], seq 7009:7413, ack 710, win 252, options [nop,nop,TS val 904465625 ecr 551849004], length 404
10:00:19.443914 IP new-node.my.domain.60406 > master.my.domain.9300: Flags [P.], seq 710:750, ack 7413, win 381, options [nop,nop,TS val 551849005 ecr 904465625], length 40
10:00:19.444025 IP new-node.my.domain.60406 > master.my.domain.9300: Flags [F.], seq 750, ack 7413, win 381, options [nop,nop,TS val 551849005 ecr 904465625], length 0
10:00:19.444302 IP master.my.domain.9300 > new-node.my.domain.60406: Flags [F.], seq 7413, ack 751, win 252, options [nop,nop,TS val 904465626 ecr 551849005], length 0
10:00:19.444427 IP new-node.my.domain.60406 > master.my.domain.9300: Flags [.], ack 7414, win 381, options [nop,nop,TS val 551849005 ecr 904465626], length 0
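As a cross-check of the capture above, the same reachability can be tested from the new node's side with a short script. This is only a minimal sketch, assuming Python 3 is available on the new node's host; the function name `can_connect` is my own, and the check proves only that a TCP handshake to the port completes, not that the Elasticsearch transport or TLS handshake succeeds.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the hostname and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS resolution failures, refused connections, and timeouts.
        return False
```

Calling `can_connect("master.my.domain", 9300)` on the new node would be expected to return `True` given the capture above, which suggests the problem is not basic network reachability.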
The logs I am getting on the new node show this error:
{"type": "server", "timestamp": "2020-03-17T15:25:40,353Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "elk", "node.name": "elk-2", "message": "master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [elk-1] to bootstrap a cluster: have discovered [{elk-2}{M01ZLrZnQDiUQrojiPvAiQ}{RXkJwksNTsGHiJ5DclyR6g}{172.17.0.2}{172.17.0.2:9300}{dilm}{ml.machine_memory=16820195328, xpack.installed=true, ml.max_open_jobs=20}]; discovery will continue using [1xx.1xx.254.3:9300] from hosts providers and [{elk-2}{M01ZLrZnQDiUQrojiPvAiQ}{RXkJwksNTsGHiJ5DclyR6g}{172.17.0.2}{172.17.0.2:9300}{dilm}{ml.machine_memory=16820195328, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0" }
(I have masked part of the public IP in this log, the one ending in 254.3; the log shows the correct IP for my master host.)
I don't see any error logs in Elasticsearch on my current master that seem to be relevant to this issue.
What am I doing wrong? Why can't my new ES node join the current single node cluster?
Any help would be greatly appreciated