We are trying to upgrade our Elastic Stack deployment so that the hosts run Ubuntu Jammy (22.04) with Docker version 28.0.1. Previously the hosts ran Ubuntu Bionic (18.04) with Docker version 18.06.3-ce. The Elasticsearch Docker image is 7.17.12 and is based on Ubuntu 20.04.6; the same problem also happens with 7.17.28. The elasticsearch.yaml is the same as before:
---
cluster.name: elastic-docker-cluster
network.host: pmd43test-elastic-1.platform-lab.cloud.xxx.org
network.bind_host: 0.0.0.0
cluster.initial_master_nodes: ["pmd43test-elastic-1", "pmd43test-elastic-2", "pmd43test-elastic-3"]
discovery.seed_hosts: ["pmd43test-elastic-2.ops.platform-lab.intra", "pmd43test-elastic-3.ops.platform-lab.intra"]
node.data: true
node.master: true
node.name: "pmd43test-elastic-1"
node.ingest: true
node.ml: false
## X-Pack settings
xpack.license.self_generated.type: basic
This is for node 1. The others differ only in their respective node numbers.
Running a curl command on the node gives this result:
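Since network.host is given as a hostname, the address each node publishes depends on how that name resolves inside the container. In case it helps anyone reproduce this, here is a minimal Python sketch for checking that from inside a container. The function name resolve_all is my own, and I am assuming Python is available in the container, which it may not be in the stock image.

```python
import socket


def resolve_all(hostname: str) -> list[str]:
    """Return the sorted, de-duplicated addresses that hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Run inside the container: shows what the container's own hostname resolves to.
    name = socket.gethostname()
    try:
        print(name, "->", resolve_all(name))
    except socket.gaierror as exc:
        print(name, "did not resolve:", exc)
```

The same function can be called with the value of network.host from the config above to compare the result on the old and new hosts.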
$ curl "localhost:9200/_cat/indices?v"
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}
In the elasticsearch docker logs I see this message:
{"type": "server", "timestamp": "2025-02-28T20:28:07,663Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "elastic-docker-cluster", "node.name": "pmd43test-elastic-1", "message": "master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [pmd43test-elastic-1, pmd43test-elastic-2, pmd43test-elastic-3] to bootstrap a cluster: have discovered [{pmd43test-elastic-1}{ysP7nojvR9WERKGI-w5R-g}{hpo7cInWTjKjo_JyLFMXyw}{pmd43test-elastic-1.platform-lab.cloud.xxx.org}{127.0.1.1:9300}{cdfhimrstw}]; discovery will continue using [172.17.2.61:9300, 172.17.3.30:9300] from hosts providers and [{pmd43test-elastic-1}{ysP7nojvR9WERKGI-w5R-g}{hpo7cInWTjKjo_JyLFMXyw}{pmd43test-elastic-1.platform-lab.cloud.xxx.org}{127.0.1.1:9300}{cdfhimrstw}] from last-known cluster state; node term 0, last-accepted version 0 in term 0" }
and these:
{"type": "server", "timestamp": "2025-02-28T20:28:05,372Z", "level": "WARN", "component": "o.e.d.PeerFinder", "cluster.name": "elastic-docker-cluster", "node.name": "pmd43test-elastic-1", "message": "address [172.17.2.61:9300], node [null], requesting [false] connection failed: [pmd43test-elastic-2][127.0.1.1:9300] handshake failed. unexpected remote node {pmd43test-elastic-1}{ysP7nojvR9WERKGI-w5R-g}{hpo7cInWTjKjo_JyLFMXyw}{pmd43test-elastic-1.platform-lab.cloud.xxx.org}{127.0.1.1:9300}{cdfhimrstw}{xpack.installed=true, transform.node=true}" }
{"type": "server", "timestamp": "2025-02-28T20:28:02,370Z", "level": "WARN", "component": "o.e.d.HandshakingTransportAddressConnector", "cluster.name": "elastic-docker-cluster", "node.name": "pmd43test-elastic-1", "message": "[connectToRemoteMasterNode[172.17.2.61:9300]] completed handshake with [{pmd43test-elastic-2}{cV8rAE6rQgOod3BvAWrWDw}{ahXw6NLvTNy8AYdvm9ld-g}{pmd43test-elastic-2.platform-lab.cloud.xxx.org}{127.0.1.1:9300}{cdfhimrstw}{xpack.installed=true, transform.node=true}] but followup connection failed",
"stacktrace": ["org.elasticsearch.transport.ConnectTransportException: [pmd43test-elastic-2][127.0.1.1:9300] handshake failed. unexpected remote node {pmd43test-elastic-1}{ysP7nojvR9WERKGI-w5R-g}{hpo7cInWTjKjo_JyLFMXyw}{pmd43test-elastic-1.platform-lab.cloud.xxx.org}{127.0.1.1:9300}{cdfhimrstw}{xpack.installed=true, transform.node=true}",
...
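What stands out to me in these warnings is that both pmd43test-elastic-1 and pmd43test-elastic-2 report 127.0.1.1:9300 as their transport address. To make that easier to see across many log lines, here is a minimal Python sketch that pulls the node name and advertised address out of a message. The descriptor layout {name}{id}{ephemeral id}{host}{address} is inferred from the log lines above rather than from a documented format, and the names NODE_RE and advertised_addresses are hypothetical.

```python
import re

# Node descriptor layout inferred from the 7.17 log lines in this post:
# {name}{nodeId}{ephemeralId}{host}{ip:port}...
NODE_RE = re.compile(
    r"\{(?P<name>[^{}]+)\}\{[^{}]+\}\{[^{}]+\}\{(?P<host>[^{}]+)\}\{(?P<addr>[^{}]+:\d+)\}"
)


def advertised_addresses(message: str) -> dict[str, str]:
    """Map node name -> advertised transport address found in a log message."""
    return {m.group("name"): m.group("addr") for m in NODE_RE.finditer(message)}


# Excerpt assembled from the HandshakingTransportAddressConnector warning above.
sample = (
    "completed handshake with [{pmd43test-elastic-2}{cV8rAE6rQgOod3BvAWrWDw}"
    "{ahXw6NLvTNy8AYdvm9ld-g}{pmd43test-elastic-2.platform-lab.cloud.xxx.org}"
    "{127.0.1.1:9300}{cdfhimrstw}] but followup connection failed: "
    "unexpected remote node {pmd43test-elastic-1}{ysP7nojvR9WERKGI-w5R-g}"
    "{hpo7cInWTjKjo_JyLFMXyw}{pmd43test-elastic-1.platform-lab.cloud.xxx.org}"
    "{127.0.1.1:9300}{cdfhimrstw}"
)

for name, addr in advertised_addresses(sample).items():
    print(name, addr)
```

Both nodes come out with the same loopback-range address, which matches what the "unexpected remote node" message seems to be complaining about.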
I've done a lot of searching and compared the two deployments, but haven't had any luck. I hope someone here can tell me where to look.
Thanks.