Docker elasticsearch can't find master

Hello, Elk buds,

I'm having an issue with my Elasticsearch nodes discovering one another.

  • ELK 7.4
  • Running in Docker
    • on the same VLAN, with no firewalls blocking
  • The two nodes run on 2 separate compute nodes

Elk master node docker-compose file

version: '2.3'
services:
#### ELK
## Elasticsearch service
  elasticsearch:
    container_name: elasticsearch
    restart: always
    
    environment:
     - cluster.name=elastic
     - node.name=siem01-a.dal.sync.lan
     - discovery.seed_hosts=siem01-b.dal.sync.lan
     - cluster.initial_master_nodes=siem01-a.dal.sync.lan
     - bootstrap.memory_lock=true
     - "ES_JAVA_OPTS=-Xms5g -Xmx5g"
     - ES_TMPDIR=/tmp
    cap_add:
     - IPC_LOCK
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    mem_limit: 10g
    ports:
     - "127.0.0.1:9200:9200"
     - "127.0.0.1:9300:9300"
     - "127.0.0.1:9301:9305"
     
    image: "Help me elk/elasticsearch:2"
    volumes:
      - /data:/data
    network_mode: "host"

(node spins up and starts processing data)

Second node in the cluster Compose file (one that is having a hard time connecting)

version: '2.3'
services:
#### ELK
## Elasticsearch service
  siem01-b.dal.sync.lan:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.4.0
    container_name: siem01-b.dal.sync.lan
    environment:
      - node.name=siem01-b.dal.sync.lan
      - discovery.seed_hosts=siem01-a.dal.sync.lan
      - node.data=true
      - cluster.initial_master_nodes=siem01-a.dal.sync.lan
      - cluster.name=elastic
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - xpack.security.enabled=true
    ulimits:
      memlock:
        soft: -1
        hard: -1
    mem_limit: 10g
    ports:
     - "9200:9200"
     - "9300:9300"
     - "9301:9305"
    volumes:
      - /data:/data
    network_mode: "host"

Log message from node 2 
WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "elastic", "node.name": "siem01-b.dal.sync.lan", "message": "master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes

Here is an example that works for me on 6.x. These are two different servers on the same network. It uses any servers listed in the discovery.zen.ping.unicast.hosts list on startup to see if there is an existing cluster. If it doesn't find one, it will start its own cluster.

environment:
  - "cluster.name=docker-cluster"
  - "node.name=SERVER1"
  - "network.publish_host=SERVER1"
  - "discovery.zen.ping.unicast.hosts=SERVER2"
  - "xpack.security.enabled=false"
  - "xpack.monitoring.collection.enabled=true"
  - "ES_JAVA_OPTS=-Xms4g -Xmx4g"

Sadly, ELK took out some of those settings in 7.0.

Any recommendations, ELK community?

This is not the complete log message, and the missing bit is the bit that describes the problem. You'll need to share the full message.

Also...

From the docs on cluster.initial_master_nodes:

You should not use this setting when restarting a cluster or adding a new node to an existing cluster.

You have an existing (one-node) cluster so you should not be using this setting any more.
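As a rough sketch of what that could look like for the second node, assuming the rest of its Compose file stays exactly as posted, the environment section would keep discovery.seed_hosts and drop cluster.initial_master_nodes. The comments are mine, and this has not been tested against this setup:

```yaml
    environment:
      - node.name=siem01-b.dal.sync.lan
      - cluster.name=elastic
      # Still needed: tells this node where to look for the existing cluster.
      - discovery.seed_hosts=siem01-a.dal.sync.lan
      # cluster.initial_master_nodes is deliberately left out, because
      # siem01-a has already bootstrapped the cluster and this node
      # should only ever join it.
      - node.data=true
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - xpack.security.enabled=true
```

This only covers the setting discussed here. The full log message is still needed to see why discovery itself is failing.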

@DavidTurner Do you know why we don't have to use this setting when restarting a cluster or adding a new node to an existing cluster?

I hadn't noticed this before, so I kept it in my configuration. I have performed some operations (adding a new node, a rolling upgrade...) and didn't have any issues.

Yes, the answer to that is in the docs:

This is only required the first time a cluster starts up: nodes that have already joined a cluster store this information in their data folder for use in a full cluster restart, and freshly-started nodes that are joining a running cluster obtain this information from the cluster’s elected master.

That says why the setting is unnecessary, but we make a stronger statement and recommend actually removing it after bootstrapping. This is to help avoid two (surprisingly common) orchestration bugs:

  1. accidentally using ephemeral storage for the master nodes
  2. adjusting cluster.initial_master_nodes as the cluster grows and shrinks

If cluster.initial_master_nodes is in place then it may be possible to form a completely new cluster when restarting or adding nodes, which is disastrous if that wasn't what you meant. Elasticsearch does what it can to prevent this disaster, but that's very much on a best-effort basis and is not watertight. It's much safer to block new cluster formation by removing the setting once it's no longer needed.
