Issue with Multihost Docker Setup Using Docker Compose

I am trying to run 2 (1×2) master nodes across two physical hosts via docker-compose.

Dockerfile

    FROM docker.elastic.co/elasticsearch/elasticsearch:7.6.0
    RUN mkdir /usr/share/elasticsearch/data{1..2}
    RUN chown elasticsearch:elasticsearch /usr/share/elasticsearch/data{1..2}

First Node: Hostname --> one.example.com --> 172.21.195.14

version: '2.2'
services:
  node01-master:
    build: .
    container_name: node01-master
    hostname: one.example.com
    environment:
      - node.name=one.example.com
      - cluster.name=es-cluster
      - discovery.seed_hosts=two.example.com
      - cluster.initial_master_nodes=one.example.com,two.example.com
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms2G -Xmx2G"
      - node.master=true 
      - node.voting_only=false
      - node.data=false
      - node.ingest=false
      - node.ml=false
      - xpack.ml.enabled=true
      - cluster.remote.connect=false
    cpus: "2"
    mem_limit: 4G
    ulimits:
      memlock:
        soft: -1
        hard: -1
    ports:
      - "9200:9200"
      - "9300:9300"
    networks:
      - elastic
    restart: always
networks:
  elastic:
    driver: bridge

Second Node --> two.example.com --> 172.21.195.13

version: '2.2'
services:
  node02-master:
    build: .
    container_name: node02-master
    hostname: two.example.com
    environment:
      - node.name=two.example.com
      - cluster.name=es-cluster
      - discovery.seed_hosts=one.example.com
      - cluster.initial_master_nodes=one.example.com,two.example.com
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms2G -Xmx2G"
      - node.master=true 
      - node.voting_only=false
      - node.data=false
      - node.ingest=false
      - node.ml=false
      - xpack.ml.enabled=true
      - cluster.remote.connect=false
    cpus: "2"
    mem_limit: 4G
    ulimits:
      memlock:
        soft: -1
        hard: -1
    ports:
      - "9200:9200"
      - "9300:9300"
    networks:
      - elastic
    restart: always
networks:
  elastic:
    driver: bridge

Both containers come up, but neither detects the other master. The TCP ports (9200, 9300) are reachable by hostname from each container to the other.

Logs from node01-master

master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [one.example.com, two.example.com] to bootstrap a cluster: have discovered [{one.example.com}{g-rSNLSUQVuuPRxVGkJjqw}{fW02vz0cRh2z-8SX5coNWw}{172.30.0.4}{172.30.0.4:9300}{m}{xpack.installed=true}]; discovery will continue using [172.21.195.14:9300] from hosts providers and [{one.example.com}{g-rSNLSUQVuuPRxVGkJjqw}{fW02vz0cRh2z-8SX5coNWw}{172.30.0.4}{172.30.0.4:9300}{m}{xpack.installed=true}] from last-known cluster state; node term 0, last-accepted version 0 in term 0"

Curl from node01-master container

# curl two.example.com:9300
This is not an HTTP port

# curl two.example.com:9200
{
  "name" : "two.example.com",
  "cluster_name" : "es-cluster",
  "cluster_uuid" : "_na_",
  "version" : {
    "number" : "7.6.0",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "7f634e9f44834fbc12724506cc1da681b0c3b1e3",
    "build_date" : "2020-02-06T00:09:00.449973Z",
    "build_snapshot" : false,
    "lucene_version" : "8.4.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

What am I missing here?

This node has not discovered the other one. Can they both communicate with each other at their publish addresses? This node's publish address is 172.30.0.4:9300 but it's looking for the other one at 172.21.195.14:9300 which is quite different.

I note that you're using a bridge network. The docs for that say the following:

Bridge networks apply to containers running on the same Docker daemon host. For communication among containers running on different Docker daemon hosts, you can either manage routing at the OS level, or you can use an overlay network.

172.30.0.4 is the IP of the container. 172.21.195.14 is the IP of node01.

This helped! I got it running by changing network.publish_host to match the host's IP address. I now have 2 (1×2) master nodes, 2 (1×2) coordinating nodes and 6 (3×2) data-only nodes spread across 2 physical hosts with Docker and without Swarm.
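
For anyone landing here later, a minimal sketch of what that change might look like in the first node's compose file. The exact lines used were not posted in this thread, so this is an assumption: it sets network.publish_host as an environment variable (the same way the other settings are passed above) and reuses the host IP 172.21.195.14 given for one.example.com.

```yaml
# Fragment of the first node's service definition (one.example.com).
# Assumption: only the publish address is changed; all other settings stay as posted.
services:
  node01-master:
    environment:
      - node.name=one.example.com
      # Advertise the physical host's IP instead of the container's bridge IP,
      # so the other host can reach this node on the published port 9300.
      - network.publish_host=172.21.195.14
    ports:
      - "9300:9300"
```

The second node would presumably get the same line with its own host IP, 172.21.195.13.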

Great, thanks for letting us know. I adjusted your post slightly: the forum interpreted your * signs as italics so I replaced them with × signs.

In future versions I think we will give a clearer indication in the logs that this is the issue thanks to https://github.com/elastic/elasticsearch/pull/51304.

Note that if you only have 2 master-eligible nodes then you should not expect your cluster to be resilient to the loss of either. You need at least three master-eligible nodes for that.
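
As a rough illustration of that point, a three-master bootstrap could list all three master-eligible nodes on every node, so that any two of them form a majority and the cluster can survive losing one. The name three.example.com is hypothetical and does not appear elsewhere in this thread.

```yaml
# Fragment of one master's environment, assuming a hypothetical third host
# named three.example.com alongside the two hosts shown above.
services:
  node01-master:
    environment:
      - discovery.seed_hosts=one.example.com,two.example.com,three.example.com
      - cluster.initial_master_nodes=one.example.com,two.example.com,three.example.com
```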

This setup was a starting point for a three-node setup. I now have 3 (1×3) master nodes, 3 (1×3) coordinating nodes and 9 (3×3) data nodes.
