I have two servers and created one Elasticsearch node on each. The docker-compose.yml files look like this:
es0:
  image: elasticsearch:7.6.0
  container_name: es0
  environment:
    - "ES_JAVA_OPTS=-Xms1024m -Xmx1024m"
  ulimits:
    memlock:
      soft: -1
      hard: -1
  ports:
    - 9200:9200
    - 9300:9300
  volumes:
    - "/mnt/docker/es0/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml"
    - "/mnt/docker/es0/data:/usr/share/elasticsearch/data"
    - "/mnt/docker/es0/plugins:/usr/share/elasticsearch/plugins"
    - "/mnt/docker/es0/config/cert:/usr/share/elasticsearch/config/cert"
es1:
  image: elasticsearch:7.6.0
  container_name: es1
  environment:
    - "ES_JAVA_OPTS=-Xms1024m -Xmx1024m"
  ulimits:
    memlock:
      soft: -1
      hard: -1
  ports:
    - 9200:9200
    - 9300:9300
  volumes:
    - "/mnt/docker/es1/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml"
    - "/mnt/docker/es1/data:/usr/share/elasticsearch/data"
    - "/mnt/docker/es1/plugins:/usr/share/elasticsearch/plugins"
    - "/mnt/docker/es1/config/cert:/usr/share/elasticsearch/config/cert"
I configured elasticsearch.yml on the two nodes as follows (es-00 first, then es-01):
cluster.name: hs-cluster
node.name: es-00
node.master: true
node.data: true
http.host: 0.0.0.0
http.port: 9200
transport.host: 0.0.0.0
transport.tcp.port: 9300
#network.host: 0.0.0.0
network.bind_host: ["192.168.0.2", "101.xx.xx.136"]
network.publish_host: 192.168.0.2
gateway.recover_after_nodes: 1
http.cors.enabled: true
http.cors.allow-origin: "*"
cluster.initial_master_nodes: ["es-00", "es-01"]
discovery.seed_hosts: [ "192.168.0.2:9300", "192.168.0.3:9300" ]
bootstrap.memory_lock: true
bootstrap.system_call_filter: false
cluster.name: hs-cluster
node.name: es-01
node.master: true
node.data: true
http.host: 0.0.0.0
http.port: 9200
transport.host: 0.0.0.0
transport.tcp.port: 9300
#network.host: 0.0.0.0
network.bind_host: ["192.168.0.3", "101.xx.xx.137"]
network.publish_host: 192.168.0.3
gateway.recover_after_nodes: 1
http.cors.enabled: true
http.cors.allow-origin: "*"
cluster.initial_master_nodes: ["es-00", "es-01"]
discovery.seed_hosts: [ "192.168.0.2:9300", "192.168.0.3:9300" ]
bootstrap.memory_lock: true
bootstrap.system_call_filter: false
When I run the instances, both start successfully. But when I call _cluster/state?pretty, both return this error message:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "master_not_discovered_exception",
        "reason" : null
      }
    ],
    "type" : "master_not_discovered_exception",
    "reason" : null
  },
  "status" : 503
}
I take this to mean that the nodes can't find each other. I also tried setting network.host: 0.0.0.0, but the result was the same. Does anyone know what causes this master_not_discovered_exception and how to resolve it?
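To make the check above repeatable on both nodes, here is a minimal Python sketch that fetches _cluster/state and reports whether the response is the master_not_discovered_exception shown above. The function names and the 60-second timeout are assumptions for illustration, not part of the original setup.

```python
import json
import urllib.error
import urllib.request


def is_master_missing(body: dict) -> bool:
    """Return True if an Elasticsearch error body reports that no master was discovered."""
    error = body.get("error")
    if not isinstance(error, dict):
        return False
    return error.get("type") == "master_not_discovered_exception"


def fetch_cluster_state(base_url: str, timeout: float = 60.0) -> dict:
    """GET <base_url>/_cluster/state and return the JSON body, including error bodies."""
    # The timeout is a generous guess: the node may wait for a master before answering.
    request = urllib.request.Request(f"{base_url}/_cluster/state")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as http_error:
        # A 503 raises HTTPError, but the JSON error body is still readable from it.
        return json.loads(http_error.read().decode("utf-8"))
```

Passing the result of fetch_cluster_state("http://192.168.0.2:9200") to is_master_missing would show whether that node sees a master; the URL assumes the HTTP port published in the compose file is reachable from where the script runs.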
By the way, I can run the cluster on a single server with Docker Compose, but it fails across different servers. I also ran telnet xxx 9300 on each server, and the connections all succeeded.
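The telnet check can be scripted; the following is a minimal Python sketch of the same TCP reachability test. The function name and timeout are assumptions for illustration. A successful connection only shows that the port is reachable from the other host; it does not show which address the node advertises to its peers.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, can_connect("192.168.0.3", 9300) run from the first server corresponds to the telnet test against the second node's transport port.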
I then added networks to docker-compose.yml and changed the network configuration in elasticsearch.yml:
network.host: 0.0.0.0
network.publish_host: 192.168.0.2
After restarting the instances, I got this error message:
master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [es-00, es-01] to bootstrap a cluster:
have discovered [{es-00}{0laFYaAxRr22MfbxvFZSlw}{tEDB7BzSQ2am1L30XTTcBQ}{172.20.0.2}{172.20.0.2:9300}{dilm}{ml.machine_memory=8201236480, xpack.installed=true, ml.max_open_jobs=20}];
discovery will continue using [192.168.0.2:9300, 192.168.0.3:9300] from hosts providers
and [{es-00}{0laFYaAxRr22MfbxvFZSlw}{tEDB7BzSQ2am1L30XTTcBQ}{172.20.0.2}{172.20.0.2:9300}{dilm}{ml.machine_memory=8201236480, xpack.installed=true, ml.max_open_jobs=20}]
from last-known cluster state; node term 0, last-accepted version 0 in term 0
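In the log above, es-00 appears with the transport address 172.20.0.2:9300, which is not one of the configured seed hosts (192.168.0.2:9300, 192.168.0.3:9300). The following minimal Python sketch pulls the advertised transport address out of such a log message and compares it with the seed hosts. The regular expression is an assumption based only on the node descriptor format visible in this log, and the function names are hypothetical.

```python
import re

# Node descriptor as it appears in the log above:
# {name}{node id}{ephemeral id}{host}{transport address}
NODE_PATTERN = re.compile(
    r"\{(?P<name>[^{}]+)\}\{[^{}]+\}\{[^{}]+\}"
    r"\{(?P<host>[^{}]+)\}\{(?P<address>[^{}]+:\d+)\}"
)


def published_addresses(log_text: str) -> dict:
    """Map each node name found in the log text to its advertised transport address."""
    return {m.group("name"): m.group("address") for m in NODE_PATTERN.finditer(log_text)}


def unlisted_nodes(log_text: str, seed_hosts: list) -> dict:
    """Return the nodes whose advertised address is not among the seed hosts."""
    return {
        name: address
        for name, address in published_addresses(log_text).items()
        if address not in seed_hosts
    }
```

Run against the message above with the two seed hosts from elasticsearch.yml, unlisted_nodes flags es-00 because its advertised address differs from both entries. Whether that mismatch is the cause of the failure is not established here; it is only a quick way to see which address a node is actually advertising.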