I use docker-compose to keep most of my configuration for various applications in Git. For the past few days I've been trying to get a test Elasticsearch cluster up and running. After way too much troubleshooting, I've narrowed the issue down to Docker or docker-compose.
Basically, if I run Elasticsearch extracted from the tarball, it works just fine. But if I run it via docker-compose, the nodes never get past master discovery.
Here is my elasticsearch.yml file:
cluster.name: mycluster
node.name: testelk01.example.org
node.master: true
http.port: 9200
transport.port: 9300
network.host: _site_
discovery.seed_hosts:
- testelk01.example.org
- testelk02.example.org
- testelk03.example.org
cluster.initial_master_nodes:
- testelk01.example.org
- testelk02.example.org
- testelk03.example.org
#http.cors.enabled: true
#http.cors.allow-origin: "*"
#http.cors.allow-headers: X-Requested-With,X-Auth-Token,Content-Type,Content-Length,Authorization,Access-Control-Allow-Origin
#http.cors.allow-credentials: true
xpack.license.self_generated.type: basic
xpack.ilm.enabled: true
xpack.monitoring.enabled: true
# security settings
xpack.security.enabled: false
#xpack.security.http.ssl.enabled: true
#xpack.security.http.ssl.key: "/usr/share/elasticsearch/config/certs/mycluster_wildcard_example_org.key"
#xpack.security.http.ssl.certificate: "/usr/share/elasticsearch/config/certs/mycluster_wildcard_example_org.crt"
#xpack.security.http.ssl.certificate_authorities:
# - "/usr/share/elasticsearch/config/certs/DigiCertCA.crt"
# - "/usr/share/elasticsearch/config/certs/DigiCertTrustedRoot.crt"
#
#
#xpack.security.transport.ssl.enabled: true
#xpack.security.transport.ssl.verification_mode: none
#xpack.security.transport.ssl.key: "/usr/share/elasticsearch/config/certs/mycluster_wildcard_example_org.key"
#xpack.security.transport.ssl.certificate: "/usr/share/elasticsearch/config/certs/mycluster_wildcard_example_org.crt"
#xpack.security.transport.ssl.certificate_authorities:
# - "/usr/share/elasticsearch/config/certs/DigiCertCA.crt"
# - "/usr/share/elasticsearch/config/certs/DigiCertTrustedRoot.crt"
path.data: /usr/share/elasticsearch/data
path.logs: /usr/share/elasticsearch/logs
bootstrap.memory_lock: false
My docker-compose file:
version: '3.3'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.6.2
    container_name: testelk01_elasticsearch
    environment:
      - "ES_JAVA_OPTS=-Xms6144m -Xmx6144m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - ./config/elasticsearch/testelk01/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - ./certs/:/usr/share/elasticsearch/config/certs/
      - /mnt/elasticsearch_iscsi/testelk01/elasticsearch/data:/usr/share/elasticsearch/data
      - /mnt/elasticsearch_iscsi/testelk01/elasticsearch/logs:/usr/share/elasticsearch/logs
    ports:
      - "< internal ip 1 >:9200:9200"
      - "< internal ip 1 >:9300-9400:9300-9400"
    healthcheck:
      test: ["CMD", "curl", "-s", "-f", "http://localhost:9200/_cat/health"]
    networks:
      - elknet
    extra_hosts:
      - "testelk01.example.org:< internal ip 1 >"
      - "testelk02.example.org:< internal ip 2 >"
      - "testelk03.example.org:< internal ip 3 >"
    restart: always
  kibana:
    container_name: testelk01_kibana
    image: docker.elastic.co/kibana/kibana:7.6.2
    volumes:
      - ./config/kibana/testelk01/kibana.yml:/usr/share/kibana/config/kibana.yml
      - /mnt/elasticsearch_iscsi/testelk01/kibana/data:/usr/share/kibana/data
      - ./certs/:/usr/share/kibana/config/certs/
    ports:
      - 127.0.0.1:5601:5601
    networks:
      - elknet
    extra_hosts:
      - "testelk01.example.org:< internal ip 1 >"
      - "testelk02.example.org:< internal ip 2 >"
      - "testelk03.example.org:< internal ip 3 >"
    restart: always
networks:
  elknet:
    driver: bridge
    driver_opts:
      com.docker.network.bridge.name: elknet
Just adjust the node name to 02 and 03, and you'll have the file for my other two nodes.
I had been trying to set things up with SSL, but I commented all of that out while troubleshooting.
To test, I set up a temp directory for data and logs, adjusted my elasticsearch.yml file to use them, and ran this on each node:
ES_PATH_CONF=/srv/elktemp/elasticsearch-7.6.2/config ./elasticsearch
It worked just fine. The nodes discovered each other, and curling the health endpoint reported 3 nodes and a green status.
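For anyone who wants to repeat that health check without curl, here is a minimal Python sketch. It assumes the node answers plain HTTP on port 9200 (security is disabled in the config above) and uses the `_cluster/health` endpoint, whose JSON response includes `status` and `number_of_nodes`. The function names are made up for illustration.

```python
import json
import urllib.request
from typing import Tuple


def summarize_health(body: str) -> Tuple[str, int]:
    """Extract (status, number_of_nodes) from a _cluster/health JSON body."""
    health = json.loads(body)
    return health["status"], health["number_of_nodes"]


def fetch_health(base_url: str, timeout: float = 5.0) -> Tuple[str, int]:
    """GET <base_url>/_cluster/health and summarize the response."""
    url = f"{base_url}/_cluster/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return summarize_health(resp.read().decode("utf-8"))
```

Calling `fetch_health("http://localhost:9200")` on a node of a healthy three-node cluster should return a green status and a node count of 3; the URL is an assumption and needs to point at a node that is actually listening.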
But if I take nearly the exact same config and run it via docker-compose, the nodes never find each other.
They eventually just keep repeating this message:
testelk03_elasticsearch | {"type": "server", "timestamp": "2020-04-27T23:03:00,987Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "mycluster", "node.name": "testelk03.example.org", "message": "master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [testelk01.example.org, testelk02.example.org, testelk03.example.org] to bootstrap a cluster: have discovered [{testelk03.example.org}{eFxpuLZ0RPCbmi1Toiqm9A}{6Ts-bEguRDaVqFq0SV4bog}{172.18.0.2}{172.18.0.2:9300}{dilm}{ml.machine_memory=8364195840, xpack.installed=true, ml.max_open_jobs=20}]; discovery will continue using [< internal ip 1 >:9300, < internal ip 2 >:9300, < internal ip 3 >:9300] from hosts providers and [{testelk03.example.org}{eFxpuLZ0RPCbmi1Toiqm9A}{6Ts-bEguRDaVqFq0SV4bog}{172.18.0.2}{172.18.0.2:9300}{dilm}{ml.machine_memory=8364195840, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0" }
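That message mentions two different sets of addresses: the transport address of each node discovered so far (here only the node itself, at 172.18.0.2:9300, which looks like an address on the Docker bridge network) and the seed addresses that discovery keeps probing. The sketch below pulls the two sets out of such a line so they can be compared side by side across nodes. It is only a parsing helper under the assumption that the line has the shape shown above; `discovery_addresses` is a hypothetical name, and it does not diagnose anything by itself.

```python
import json
import re


def discovery_addresses(log_line: str) -> dict:
    """Split a ClusterFormationFailureHelper log line into the addresses it mentions."""
    # docker-compose prefixes each line with "<container> | "; the JSON starts at the first "{".
    record = json.loads(log_line[log_line.index("{"):])
    message = record["message"]

    # Transport addresses of discovered nodes, written as {ip:port} in the message.
    discovered = sorted(set(re.findall(r"\{(\d{1,3}(?:\.\d{1,3}){3}:\d+)\}", message)))

    # Addresses discovery keeps probing, taken from the "hosts providers" list.
    match = re.search(
        r"discovery will continue using \[([^\]]*)\] from hosts providers", message
    )
    seeds = [s.strip() for s in match.group(1).split(",")] if match else []

    return {"node": record["node.name"], "discovered": discovered, "seeds": seeds}
```

Run against the line from each of the three nodes, this gives a quick view of which address every node reports for itself next to the addresses the others are trying to reach.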
It shouldn't be a networking or firewall issue. I exec'd into a container, installed nmap, and checked that port 9300 on another node was open. It was. So was 9200.
I even tried turning off the host firewall.
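The reachability check described above can be approximated without installing nmap. This is a minimal sketch using only the Python standard library; `port_is_open` is a hypothetical helper, and the host and port passed to it are whatever node you want to probe.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `port_is_open("testelk02.example.org", 9300)` run from inside a container mirrors the nmap test. Note that a successful connection only shows that something accepts TCP connections at that address; it does not show that an Elasticsearch transport handshake would succeed.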
Since it works when I remove Docker from the equation, Docker must be the issue, but I'm really not sure how. If I can reach the other nodes from inside a container, then Elasticsearch should be able to as well.
Any ideas?
Thanks in advance.