Master node role cannot be removed in 8.0.0

I have five nodes, three of them master-eligible, running on Podman on RHEL as a systemd service.
By mistake, I set the other two as master as well and ran the cluster, but they should be data-only.
When I change their elasticsearch.yml to remove the master role from node.roles on these two, the cluster is unable to form: every other node keeps trying to connect and fails.
However, I tried the exact same configuration in 7.17.1, where I can remove the master role without any problem.
I think this is a bug in version 8.0.0.

Possibly, but you'll need to share a lot more detail to help us understand it. Could you provide the logs from all nodes from startup for at least 10 minutes?

5 Nodes: es01 (master,data), es02 (master,data), es03 (master,data), eswarm01 (data_warm), escold01 (data_cold)
The logs are very long, so I have provided a selected preview.

es01 log:


> {"type": "server", "timestamp": "2022-03-01T19:13:27,163Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "node name [es01], node ID [CDPAYocARvC_f9FtrBwusg], cluster name [es-docker-cluster], roles [data, transform, master, ingest]" }
> {"type": "server", "timestamp": "2022-03-01T19:13:49,835Z", "level": "INFO", "component": "o.e.x.m.p.l.CppLogMessageHandler", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "[controller/303] [Main.cc@123] controller (64 bit): Version 8.0.0 (Build 5e85495ea85316) Copyright (c) 2022 Elasticsearch BV" }
> {"type": "server", "timestamp": "2022-03-01T19:13:51,089Z", "level": "INFO", "component": "o.e.x.s.Security", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "Security is disabled" }
> {"type": "server", "timestamp": "2022-03-01T19:13:52,896Z", "level": "INFO", "component": "o.e.t.n.NettyAllocator", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "creating NettyAllocator with the following configs: [name=unpooled, suggested_max_allocation_size=1mb, factors={es.unsafe.use_unpooled_allocator=null, g1gc_enabled=true, g1gc_region_size=4mb, heap_size=1gb}]" }
> {"type": "server", "timestamp": "2022-03-01T19:13:53,081Z", "level": "INFO", "component": "o.e.d.DiscoveryModule", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "using discovery type [zen] and seed hosts providers [settings]" }
> {"type": "server", "timestamp": "2022-03-01T19:13:57,113Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "initialized" }
> {"type": "server", "timestamp": "2022-03-01T19:13:57,113Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "starting ..." }
> {"type": "server", "timestamp": "2022-03-01T19:13:57,171Z", "level": "INFO", "component": "o.e.x.s.c.f.PersistentCache", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "persistent cache index loaded" }
> {"type": "server", "timestamp": "2022-03-01T19:13:57,173Z", "level": "INFO", "component": "o.e.x.d.l.DeprecationIndexingComponent", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "deprecation component started" }
> {"type": "server", "timestamp": "2022-03-01T19:13:57,575Z", "level": "INFO", "component": "o.e.t.TransportService", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "publish_address {10.88.0.18:9301}, bound_addresses {[::]:9301}" }
> {"type": "server", "timestamp": "2022-03-01T19:14:01,356Z", "level": "INFO", "component": "o.e.m.j.JvmGcMonitorService", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "[gc][4] overhead, spent [321ms] collecting in the last [1.1s]" }
> {"type": "server", "timestamp": "2022-03-01T19:14:03,409Z", "level": "INFO", "component": "o.e.b.BootstrapChecks", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "bound or publishing to a non-loopback address, enforcing bootstrap checks" }
> {"type": "server", "timestamp": "2022-03-01T19:14:03,411Z", "level": "INFO", "component": "o.e.c.c.Coordinator", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "cluster UUID [nlYLMj0JQwuRWH2Bfk7YhQ]" }
> {"type": "server", "timestamp": "2022-03-01T19:14:13,422Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "master not discovered or elected yet, an election requires at least 2 nodes with ids from [CDPAYocARvC_f9FtrBwusg, SofMGtv0TmCPQdFltXAMOQ, hrXk0AOHRWqTiLmzZdbghw], have only discovered non-quorum [{es01}{CDPAYocARvC_f9FtrBwusg}{sMG9CukDRZCET0NNQ0Pw6w}{10.88.0.18}{10.88.0.18:9301}{dimt}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300] from hosts providers and [{es01}{CDPAYocARvC_f9FtrBwusg}{sMG9CukDRZCET0NNQ0Pw6w}{10.88.0.18}{10.88.0.18:9301}{dimt}] from last-known cluster state; node term 280, last-accepted version 8314 in term 280" }
> {"type": "server", "timestamp": "2022-03-01T19:14:23,424Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "master not discovered or elected yet, an election requires at least 2 nodes with ids from [CDPAYocARvC_f9FtrBwusg, SofMGtv0TmCPQdFltXAMOQ, hrXk0AOHRWqTiLmzZdbghw], have only discovered non-quorum [{es01}{CDPAYocARvC_f9FtrBwusg}{sMG9CukDRZCET0NNQ0Pw6w}{10.88.0.18}{10.88.0.18:9301}{dimt}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300] from hosts providers and [{es01}{CDPAYocARvC_f9FtrBwusg}{sMG9CukDRZCET0NNQ0Pw6w}{10.88.0.18}{10.88.0.18:9301}{dimt}] from last-known cluster state; node term 280, last-accepted version 8314 in term 280" }
> {"type": "server", "timestamp": "2022-03-01T19:14:33,425Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "master not discovered or elected yet, an election requires at least 2 nodes with ids from [CDPAYocARvC_f9FtrBwusg, SofMGtv0TmCPQdFltXAMOQ, hrXk0AOHRWqTiLmzZdbghw], have only discovered non-quorum [{es01}{CDPAYocARvC_f9FtrBwusg}{sMG9CukDRZCET0NNQ0Pw6w}{10.88.0.18}{10.88.0.18:9301}{dimt}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300] from hosts providers and [{es01}{CDPAYocARvC_f9FtrBwusg}{sMG9CukDRZCET0NNQ0Pw6w}{10.88.0.18}{10.88.0.18:9301}{dimt}] from last-known cluster state; node term 280, last-accepted version 8314 in term 280" }
> {"type": "server", "timestamp": "2022-03-01T19:14:33,469Z", "level": "WARN", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "timed out while waiting for initial discovery state - timeout: 30s" }
> {"type": "server", "timestamp": "2022-03-01T19:14:33,479Z", "level": "INFO", "component": "o.e.h.AbstractHttpServerTransport", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "publish_address {10.88.0.18:9201}, bound_addresses {[::]:9201}" }
> {"type": "server", "timestamp": "2022-03-01T19:14:33,480Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "started" }
> {"type": "server", "timestamp": "2022-03-01T19:14:43,427Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "master not discovered or elected yet, an election requires at least 2 nodes with ids from [CDPAYocARvC_f9FtrBwusg, SofMGtv0TmCPQdFltXAMOQ, hrXk0AOHRWqTiLmzZdbghw], have only discovered non-quorum [{es01}{CDPAYocARvC_f9FtrBwusg}{sMG9CukDRZCET0NNQ0Pw6w}{10.88.0.18}{10.88.0.18:9301}{dimt}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300] from hosts providers and [{es01}{CDPAYocARvC_f9FtrBwusg}{sMG9CukDRZCET0NNQ0Pw6w}{10.88.0.18}{10.88.0.18:9301}{dimt}] from last-known cluster state; node term 280, last-accepted version 8314 in term 280" }
> {"type": "server", "timestamp": "2022-03-01T19:19:03,547Z", "level": "WARN", "component": "o.e.d.PeerFinder", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "address [127.0.0.1:9300], node [null], requesting [false] connection failed: [escold01][10.88.0.18:9300] non-master-eligible node found" }

es02 log:
Only a subset to show it has the same error

> {"type": "server", "timestamp": "2022-03-01T19:13:29,237Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es02", "message": "node name [es02], node ID [hrXk0AOHRWqTiLmzZdbghw], cluster name [es-docker-cluster], roles [transform, data, ingest, master]" }
> {"type": "server", "timestamp": "2022-03-01T19:13:52,634Z", "level": "INFO", "component": "o.e.x.m.p.l.CppLogMessageHandler", "cluster.name": "es-docker-cluster", "node.name": "es02", "message": "[controller/304] [Main.cc@123] controller (64 bit): Version 8.0.0 (Build 5e85495ea85316) Copyright (c) 2022 Elasticsearch BV" }
> {"type": "server", "timestamp": "2022-03-01T19:13:53,540Z", "level": "INFO", "component": "o.e.x.s.Security", "cluster.name": "es-docker-cluster", "node.name": "es02", "message": "Security is disabled" }
> {"type": "server", "timestamp": "2022-03-01T19:13:56,249Z", "level": "INFO", "component": "o.e.t.n.NettyAllocator", "cluster.name": "es-docker-cluster", "node.name": "es02", "message": "creating NettyAllocator with the following configs: [name=unpooled, suggested_max_allocation_size=1mb, factors={es.unsafe.use_unpooled_allocator=null, g1gc_enabled=true, g1gc_region_size=4mb, heap_size=1gb}]" }
> {"type": "server", "timestamp": "2022-03-01T19:13:56,537Z", "level": "INFO", "component": "o.e.d.DiscoveryModule", "cluster.name": "es-docker-cluster", "node.name": "es02", "message": "using discovery type [zen] and seed hosts providers [settings]" }
> {"type": "server", "timestamp": "2022-03-01T19:13:59,824Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es02", "message": "initialized" }
> {"type": "server", "timestamp": "2022-03-01T19:13:59,825Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es02", "message": "starting ..." }
> {"type": "server", "timestamp": "2022-03-01T19:13:59,954Z", "level": "INFO", "component": "o.e.x.s.c.f.PersistentCache", "cluster.name": "es-docker-cluster", "node.name": "es02", "message": "persistent cache index loaded" }
> {"type": "server", "timestamp": "2022-03-01T19:13:59,955Z", "level": "INFO", "component": "o.e.x.d.l.DeprecationIndexingComponent", "cluster.name": "es-docker-cluster", "node.name": "es02", "message": "deprecation component started" }
> {"type": "server", "timestamp": "2022-03-01T19:14:00,310Z", "level": "INFO", "component": "o.e.t.TransportService", "cluster.name": "es-docker-cluster", "node.name": "es02", "message": "publish_address {10.88.0.18:9303}, bound_addresses {[::]:9303}" }
> {"type": "server", "timestamp": "2022-03-01T19:14:04,898Z", "level": "INFO", "component": "o.e.b.BootstrapChecks", "cluster.name": "es-docker-cluster", "node.name": "es02", "message": "bound or publishing to a non-loopback address, enforcing bootstrap checks" }
> {"type": "server", "timestamp": "2022-03-01T19:14:04,901Z", "level": "INFO", "component": "o.e.c.c.Coordinator", "cluster.name": "es-docker-cluster", "node.name": "es02", "message": "cluster UUID [nlYLMj0JQwuRWH2Bfk7YhQ]" }
> {"type": "server", "timestamp": "2022-03-01T19:14:14,918Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "es-docker-cluster", "node.name": "es02", "message": "master not discovered or elected yet, an election requires at least 2 nodes with ids from [CDPAYocARvC_f9FtrBwusg, SofMGtv0TmCPQdFltXAMOQ, hrXk0AOHRWqTiLmzZdbghw], have only discovered non-quorum [{es02}{hrXk0AOHRWqTiLmzZdbghw}{HN9pbrMWRNuAa_Q7eQAlPg}{10.88.0.18}{10.88.0.18:9303}{dimt}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300] from hosts providers and [{es02}{hrXk0AOHRWqTiLmzZdbghw}{HN9pbrMWRNuAa_Q7eQAlPg}{10.88.0.18}{10.88.0.18:9303}{dimt}] from last-known cluster state; node term 280, last-accepted version 8314 in term 280" }

es03 log:
Only a subset to show it has the same error

> {"type": "server", "timestamp": "2022-03-01T19:13:33,043Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "node name [es03], node ID [SofMGtv0TmCPQdFltXAMOQ], cluster name [es-docker-cluster], roles [transform, data, ingest, master]" }
> {"type": "server", "timestamp": "2022-03-01T19:13:50,658Z", "level": "INFO", "component": "o.e.x.m.p.l.CppLogMessageHandler", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "[controller/304] [Main.cc@123] controller (64 bit): Version 8.0.0 (Build 5e85495ea85316) Copyright (c) 2022 Elasticsearch BV" }
> {"type": "server", "timestamp": "2022-03-01T19:13:51,308Z", "level": "INFO", "component": "o.e.x.s.Security", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "Security is disabled" }
> {"type": "server", "timestamp": "2022-03-01T19:13:53,243Z", "level": "INFO", "component": "o.e.t.n.NettyAllocator", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "creating NettyAllocator with the following configs: [name=unpooled, suggested_max_allocation_size=1mb, factors={es.unsafe.use_unpooled_allocator=null, g1gc_enabled=true, g1gc_region_size=4mb, heap_size=1gb}]" }
> {"type": "server", "timestamp": "2022-03-01T19:13:53,450Z", "level": "INFO", "component": "o.e.d.DiscoveryModule", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "using discovery type [zen] and seed hosts providers [settings]" }
> {"type": "server", "timestamp": "2022-03-01T19:13:57,312Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "initialized" }
> {"type": "server", "timestamp": "2022-03-01T19:13:57,312Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "starting ..." }
> {"type": "server", "timestamp": "2022-03-01T19:13:57,384Z", "level": "INFO", "component": "o.e.x.s.c.f.PersistentCache", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "persistent cache index loaded" }
> {"type": "server", "timestamp": "2022-03-01T19:13:57,385Z", "level": "INFO", "component": "o.e.x.d.l.DeprecationIndexingComponent", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "deprecation component started" }
> {"type": "server", "timestamp": "2022-03-01T19:13:57,600Z", "level": "INFO", "component": "o.e.t.TransportService", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "publish_address {10.88.0.18:9302}, bound_addresses {[::]:9302}" }
> {"type": "server", "timestamp": "2022-03-01T19:13:59,394Z", "level": "INFO", "component": "o.e.m.j.JvmGcMonitorService", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "[gc][2] overhead, spent [338ms] collecting in the last [1s]" }
> {"type": "server", "timestamp": "2022-03-01T19:14:03,930Z", "level": "INFO", "component": "o.e.b.BootstrapChecks", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "bound or publishing to a non-loopback address, enforcing bootstrap checks" }
> {"type": "server", "timestamp": "2022-03-01T19:14:03,937Z", "level": "INFO", "component": "o.e.c.c.Coordinator", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "cluster UUID [nlYLMj0JQwuRWH2Bfk7YhQ]" }
> {"type": "server", "timestamp": "2022-03-01T19:14:13,954Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "master not discovered or elected yet, an election requires at least 2 nodes with ids from [CDPAYocARvC_f9FtrBwusg, SofMGtv0TmCPQdFltXAMOQ, hrXk0AOHRWqTiLmzZdbghw], have only discovered non-quorum [{es03}{SofMGtv0TmCPQdFltXAMOQ}{iAXss026Q5OwAimO71bB8Q}{10.88.0.18}{10.88.0.18:9302}{dimt}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300] from hosts providers and [{es03}{SofMGtv0TmCPQdFltXAMOQ}{iAXss026Q5OwAimO71bB8Q}{10.88.0.18}{10.88.0.18:9302}{dimt}] from last-known cluster state; node term 280, last-accepted version 8313 in term 280" }
> {"type": "server", "timestamp": "2022-03-01T19:14:23,956Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "master not discovered or elected yet, an election requires at least 2 nodes with ids from [CDPAYocARvC_f9FtrBwusg, SofMGtv0TmCPQdFltXAMOQ, hrXk0AOHRWqTiLmzZdbghw], have only discovered non-quorum [{es03}{SofMGtv0TmCPQdFltXAMOQ}{iAXss026Q5OwAimO71bB8Q}{10.88.0.18}{10.88.0.18:9302}{dimt}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300] from hosts providers and [{es03}{SofMGtv0TmCPQdFltXAMOQ}{iAXss026Q5OwAimO71bB8Q}{10.88.0.18}{10.88.0.18:9302}{dimt}] from last-known cluster state; node term 280, last-accepted version 8313 in term 280" }
> {"type": "server", "timestamp": "2022-03-01T19:14:33,958Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "master not discovered or elected yet, an election requires at least 2 nodes with ids from [CDPAYocARvC_f9FtrBwusg, SofMGtv0TmCPQdFltXAMOQ, hrXk0AOHRWqTiLmzZdbghw], have only discovered non-quorum [{es03}{SofMGtv0TmCPQdFltXAMOQ}{iAXss026Q5OwAimO71bB8Q}{10.88.0.18}{10.88.0.18:9302}{dimt}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300] from hosts providers and [{es03}{SofMGtv0TmCPQdFltXAMOQ}{iAXss026Q5OwAimO71bB8Q}{10.88.0.18}{10.88.0.18:9302}{dimt}] from last-known cluster state; node term 280, last-accepted version 8313 in term 280" }
> {"type": "server", "timestamp": "2022-03-01T19:14:33,991Z", "level": "WARN", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "timed out while waiting for initial discovery state - timeout: 30s" }
> {"type": "server", "timestamp": "2022-03-01T19:14:34,002Z", "level": "INFO", "component": "o.e.h.AbstractHttpServerTransport", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "publish_address {10.88.0.18:9202}, bound_addresses {[::]:9202}" }
> {"type": "server", "timestamp": "2022-03-01T19:14:34,002Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "started" }
> {"type": "server", "timestamp": "2022-03-01T19:14:43,960Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "es-docker-cluster", "node.name": "es03", "message": "master not discovered or elected yet, an election requires at least 2 nodes with ids from [CDPAYocARvC_f9FtrBwusg, SofMGtv0TmCPQdFltXAMOQ, hrXk0AOHRWqTiLmzZdbghw], have only discovered non-quorum [{es03}{SofMGtv0TmCPQdFltXAMOQ}{iAXss026Q5OwAimO71bB8Q}{10.88.0.18}{10.88.0.18:9302}{dimt}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300] from hosts providers and [{es03}{SofMGtv0TmCPQdFltXAMOQ}{iAXss026Q5OwAimO71bB8Q}{10.88.0.18}{10.88.0.18:9302}{dimt}] from last-known cluster state; node term 280, last-accepted version 8313 in term 280" }

escold01 log:
This is not a master node

> {"type": "server", "timestamp": "2022-03-01T19:13:48,364Z", "level": "INFO", "component": "o.e.x.m.p.l.CppLogMessageHandler", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "[controller/304] [Main.cc@123] controller (64 bit): Version 8.0.0 (Build 5e85495ea85316) Copyright (c) 2022 Elasticsearch BV" }
> {"type": "server", "timestamp": "2022-03-01T19:13:49,138Z", "level": "INFO", "component": "o.e.x.s.Security", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "Security is disabled" }
> {"type": "server", "timestamp": "2022-03-01T19:13:51,745Z", "level": "INFO", "component": "o.e.t.n.NettyAllocator", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "creating NettyAllocator with the following configs: [name=unpooled, suggested_max_allocation_size=1mb, factors={es.unsafe.use_unpooled_allocator=null, g1gc_enabled=true, g1gc_region_size=4mb, heap_size=1gb}]" }
> {"type": "server", "timestamp": "2022-03-01T19:13:52,077Z", "level": "INFO", "component": "o.e.d.DiscoveryModule", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "using discovery type [zen] and seed hosts providers [settings]" }
> {"type": "server", "timestamp": "2022-03-01T19:13:56,426Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "initialized" }
> {"type": "server", "timestamp": "2022-03-01T19:13:56,426Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "starting ..." }
> {"type": "server", "timestamp": "2022-03-01T19:13:56,482Z", "level": "INFO", "component": "o.e.x.s.c.f.PersistentCache", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "persistent cache index loaded" }
> {"type": "server", "timestamp": "2022-03-01T19:13:56,484Z", "level": "INFO", "component": "o.e.x.d.l.DeprecationIndexingComponent", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "deprecation component started" }
> {"type": "server", "timestamp": "2022-03-01T19:13:57,000Z", "level": "INFO", "component": "o.e.t.TransportService", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "publish_address {10.88.0.18:9300}, bound_addresses {[::]:9300}" }
> {"type": "server", "timestamp": "2022-03-01T19:14:02,909Z", "level": "INFO", "component": "o.e.b.BootstrapChecks", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "bound or publishing to a non-loopback address, enforcing bootstrap checks" }
> {"type": "server", "timestamp": "2022-03-01T19:14:02,911Z", "level": "INFO", "component": "o.e.c.c.Coordinator", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "cluster UUID [nlYLMj0JQwuRWH2Bfk7YhQ]" }
> {"type": "server", "timestamp": "2022-03-01T19:14:12,959Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "master not discovered yet: have discovered [{escold01}{y-_AN6loRMK9j1wTTEA-Ug}{aoPzZBHqQ-O0bD_1hnfHBQ}{10.88.0.18}{10.88.0.18:9300}{c}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300] from hosts providers and [] from last-known cluster state; node term 280, last-accepted version 8314 in term 280" }
> {"type": "server", "timestamp": "2022-03-01T19:14:22,961Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "master not discovered yet: have discovered [{escold01}{y-_AN6loRMK9j1wTTEA-Ug}{aoPzZBHqQ-O0bD_1hnfHBQ}{10.88.0.18}{10.88.0.18:9300}{c}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300] from hosts providers and [] from last-known cluster state; node term 280, last-accepted version 8314 in term 280" }
> {"type": "server", "timestamp": "2022-03-01T19:14:32,962Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "master not discovered yet: have discovered [{escold01}{y-_AN6loRMK9j1wTTEA-Ug}{aoPzZBHqQ-O0bD_1hnfHBQ}{10.88.0.18}{10.88.0.18:9300}{c}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300] from hosts providers and [] from last-known cluster state; node term 280, last-accepted version 8314 in term 280" }
> {"type": "server", "timestamp": "2022-03-01T19:14:32,972Z", "level": "WARN", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "timed out while waiting for initial discovery state - timeout: 30s" }
> {"type": "server", "timestamp": "2022-03-01T19:14:32,984Z", "level": "INFO", "component": "o.e.h.AbstractHttpServerTransport", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "publish_address {10.88.0.18:9200}, bound_addresses {[::]:9200}" }
> {"type": "server", "timestamp": "2022-03-01T19:14:32,984Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "started" }
> {"type": "server", "timestamp": "2022-03-01T19:14:42,963Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "master not discovered yet: have discovered [{escold01}{y-_AN6loRMK9j1wTTEA-Ug}{aoPzZBHqQ-O0bD_1hnfHBQ}{10.88.0.18}{10.88.0.18:9300}{c}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300, 127.0.0.1:9300] from hosts providers and [] from last-known cluster state; node term 280, last-accepted version 8314 in term 280" }
> {"type": "server", "timestamp": "2022-03-01T19:19:03,098Z", "level": "WARN", "component": "o.e.d.PeerFinder", "cluster.name": "es-docker-cluster", "node.name": "escold01", "message": "address [127.0.0.1:9300], node [null], requesting [false] connection failed: [escold01][10.88.0.18:9300] local node found" }

As a result, Kibana does not load, failing with the following error:

> [2022-03-01T21:06:34.338+00:00][INFO ][plugins.ruleRegistry] Installing common resources shared between all indices
> [2022-03-01T21:06:35.113+00:00][INFO ][plugins.screenshotting.config] Chromium sandbox provides an additional layer of protection, and is supported for Linux Ubuntu 20.04 OS. Automatically enabling Chromium sandbox.
> [2022-03-01T21:06:38.314+00:00][INFO ][plugins.screenshotting.chromium] Browser executable: /usr/share/kibana/x-pack/plugins/screenshotting/chromium/headless_shell-linux_x64/headless_shell
> [2022-03-01T21:08:35.850+00:00][FATAL][root] TimeoutError: Request timed out
>     at KibanaTransport.request (/usr/share/kibana/node_modules/@elastic/transport/lib/Transport.js:503:31)
>     at runMicrotasks (<anonymous>)
>     at processTicksAndRejections (node:internal/process/task_queues:96:5)
>     at KibanaTransport.request (/usr/share/kibana/src/core/server/elasticsearch/client/create_transport.js:63:16)
>     at Cluster.getSettings (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/api/api/cluster.js:158:16)
>     at isInlineScriptingEnabled (/usr/share/kibana/src/core/server/elasticsearch/is_scripting_enabled.js:24:7)
>     at ElasticsearchService.start (/usr/share/kibana/src/core/server/elasticsearch/elasticsearch_service.js:121:32)
>     at Server.start (/usr/share/kibana/src/core/server/server.js:319:32)
>     at Root.start (/usr/share/kibana/src/core/server/root/index.js:69:14)
>     at bootstrap (/usr/share/kibana/src/core/server/bootstrap.js:120:5)
>     at Command.<anonymous> (/usr/share/kibana/src/cli/serve/serve.js:216:5)
> [2022-03-01T21:08:35.864+00:00][INFO ][plugins-system.preboot] Stopping all plugins.
> [2022-03-01T21:08:35.865+00:00][INFO ][plugins-system.standard] Stopping all plugins.
> [2022-03-01T21:08:35.865+00:00][INFO ][plugins.monitoring.monitoring.kibana-monitoring] Monitoring stats collection is stopped
> [2022-03-01T21:09:05.870+00:00][WARN ][plugins-system.standard] "eventLog" plugin didn't stop in 30sec., move on to the next.
> 
>  FATAL  TimeoutError: Request timed out

If I change the elasticsearch.yml of node escold01 to set the master role, everything works fine.

I don't think you've picked out the useful messages from these logs. Please share them in full. Use https://gist.github.com if they're too big to fit here.

Ah actually I see it now

You haven't configured discovery correctly. You need to tell each node the addresses of all the other master-eligible nodes (usually by setting discovery.seed_hosts, although there are alternatives). The current config says that every master node is at 127.0.0.1:9300, which sounds completely wrong.
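For illustration only (the addresses below are placeholders, not values taken from this cluster), the idea is that each node lists transport addresses at which the master-eligible nodes can actually be reached, for example:

> discovery.seed_hosts: ["192.168.10.1:9300", "192.168.10.2:9300", "192.168.10.3:9300"]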

Thank you for taking a look.
See the seed_hosts settings below.

Here is an extract from the different elasticsearch.yml files:

> node.name: es01
> cluster.name: es-docker-cluster
> node.roles: master,data,ingest,transform
> network.host: 0.0.0.0
> discovery.seed_hosts: ["es02","es03","eswarm01","escold01"]
> cluster.initial_master_nodes: ["es01","es02","es03"]
>
> node.name: es02
> cluster.name: es-docker-cluster
> node.roles: master,data,ingest,transform
> network.host: 0.0.0.0
> discovery.seed_hosts: ["es01","es03","eswarm01","escold01"]
> cluster.initial_master_nodes: ["es01","es02","es03"]
>
> node.name: es03
> cluster.name: es-docker-cluster
> node.roles: master,data,ingest,transform
> network.host: 0.0.0.0
> discovery.seed_hosts: ["es01","es02","eswarm01","escold01"]
> cluster.initial_master_nodes: ["es01","es02","es03"]
>
> node.name: eswarm01
> cluster.name: es-docker-cluster
> node.roles: ["data_warm"]
> network.host: 0.0.0.0
> discovery.seed_hosts: ["es01","es02","es03","escold01"]
> cluster.initial_master_nodes: ["es01","es02","es03"]
>
> node.name: escold01
> cluster.name: es-docker-cluster
> node.roles: ["data_cold"]
> network.host: 0.0.0.0
> discovery.seed_hosts: ["es01","es02","es03","eswarm01"]
> cluster.initial_master_nodes: ["es01","es02","es03"]

If I just add "master" in the node.roles of eswarm01 and escold01, everything loads fine, I can see it in kibana.

I added role "master" to eswarm01 and escold01 nodes.
Here is the result of es01 log.

> {"type": "server", "timestamp": "2022-03-01T22:39:26,080Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "node name [es01], node ID [CDPAYocARvC_f9FtrBwusg], cluster name [es-docker-cluster], roles [ingest, master, transform, data]" }
> {"type": "server", "timestamp": "2022-03-01T22:39:47,403Z", "level": "INFO", "component": "o.e.x.m.p.l.CppLogMessageHandler", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "[controller/303] [Main.cc@123] controller (64 bit): Version 8.0.0 (Build 5e85495ea85316) Copyright (c) 2022 Elasticsearch BV" }
> {"type": "server", "timestamp": "2022-03-01T22:39:47,810Z", "level": "INFO", "component": "o.e.x.s.Security", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "Security is disabled" }
> {"type": "server", "timestamp": "2022-03-01T22:39:50,923Z", "level": "INFO", "component": "o.e.t.n.NettyAllocator", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "creating NettyAllocator with the following configs: [name=unpooled, suggested_max_allocation_size=1mb, factors={es.unsafe.use_unpooled_allocator=null, g1gc_enabled=true, g1gc_region_size=4mb, heap_size=1gb}]" }
> {"type": "server", "timestamp": "2022-03-01T22:39:51,107Z", "level": "INFO", "component": "o.e.d.DiscoveryModule", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "using discovery type [zen] and seed hosts providers [settings]" }
> {"type": "server", "timestamp": "2022-03-01T22:39:54,479Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "initialized" }
> {"type": "server", "timestamp": "2022-03-01T22:39:54,479Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "starting ..." }
> {"type": "server", "timestamp": "2022-03-01T22:39:54,604Z", "level": "INFO", "component": "o.e.x.s.c.f.PersistentCache", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "persistent cache index loaded" }
> {"type": "server", "timestamp": "2022-03-01T22:39:54,605Z", "level": "INFO", "component": "o.e.x.d.l.DeprecationIndexingComponent", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "deprecation component started" }
> {"type": "server", "timestamp": "2022-03-01T22:39:54,948Z", "level": "INFO", "component": "o.e.t.TransportService", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "publish_address {10.88.0.19:9303}, bound_addresses {[::]:9303}" }
> {"type": "server", "timestamp": "2022-03-01T22:39:59,163Z", "level": "INFO", "component": "o.e.b.BootstrapChecks", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "bound or publishing to a non-loopback address, enforcing bootstrap checks" }
> {"type": "server", "timestamp": "2022-03-01T22:39:59,180Z", "level": "INFO", "component": "o.e.c.c.Coordinator", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "cluster UUID [nlYLMj0JQwuRWH2Bfk7YhQ]" }
> {"type": "server", "timestamp": "2022-03-01T22:39:59,589Z", "level": "WARN", "component": "o.e.d.PeerFinder", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "address [10.88.0.19:9301], node [{eswarm01}{VTa0mSKaS32Rg2VL47P0OQ}{3Yxu1I44Q-2oSR9MGqGWxQ}{10.88.0.19}{10.88.0.19:9301}{mw}{xpack.installed=true}], requesting [false] peers request failed",
> "stacktrace": ["org.elasticsearch.transport.NodeDisconnectedException: [eswarm01][10.88.0.19:9301][internal:discovery/request_peers] disconnected"] }
> {"type": "server", "timestamp": "2022-03-01T22:39:59,623Z", "level": "INFO", "component": "o.e.c.s.MasterService", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "elected-as-master ([2] nodes joined)[{es03}{SofMGtv0TmCPQdFltXAMOQ}{RcMTEhUpREiTKXcSgWN3NQ}{10.88.0.19}{10.88.0.19:9300}{dimt} elect leader, {es01}{CDPAYocARvC_f9FtrBwusg}{UXnTjzYzSry7AQyTGJ32fw}{10.88.0.19}{10.88.0.19:9303}{dimt} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 281, version: 8315, delta: master node changed {previous [], current [{es01}{CDPAYocARvC_f9FtrBwusg}{UXnTjzYzSry7AQyTGJ32fw}{10.88.0.19}{10.88.0.19:9303}{dimt}]}, added {{es03}{SofMGtv0TmCPQdFltXAMOQ}{RcMTEhUpREiTKXcSgWN3NQ}{10.88.0.19}{10.88.0.19:9300}{dimt}}" }
> {"type": "server", "timestamp": "2022-03-01T22:40:00,254Z", "level": "INFO", "component": "o.e.c.s.ClusterApplierService", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "master node changed {previous [], current [{es01}{CDPAYocARvC_f9FtrBwusg}{UXnTjzYzSry7AQyTGJ32fw}{10.88.0.19}{10.88.0.19:9303}{dimt}]}, added {{es03}{SofMGtv0TmCPQdFltXAMOQ}{RcMTEhUpREiTKXcSgWN3NQ}{10.88.0.19}{10.88.0.19:9300}{dimt}}, term: 281, version: 8315, reason: Publication{term=281, version=8315}" }
> {"type": "server", "timestamp": "2022-03-01T22:40:00,431Z", "level": "INFO", "component": "o.e.h.AbstractHttpServerTransport", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "publish_address {10.88.0.19:9201}, bound_addresses {[::]:9201}", "cluster.uuid": "nlYLMj0JQwuRWH2Bfk7YhQ", "node.id": "CDPAYocARvC_f9FtrBwusg"  }
> {"type": "server", "timestamp": "2022-03-01T22:40:00,431Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "started", "cluster.uuid": "nlYLMj0JQwuRWH2Bfk7YhQ", "node.id": "CDPAYocARvC_f9FtrBwusg"  }
> {"type": "server", "timestamp": "2022-03-01T22:40:00,435Z", "level": "INFO", "component": "o.e.c.s.MasterService", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "node-join[{eswarm01}{VTa0mSKaS32Rg2VL47P0OQ}{3Yxu1I44Q-2oSR9MGqGWxQ}{10.88.0.19}{10.88.0.19:9301}{mw} join existing leader, {es03}{SofMGtv0TmCPQdFltXAMOQ}{RcMTEhUpREiTKXcSgWN3NQ}{10.88.0.19}{10.88.0.19:9300}{dimt} join existing leader], term: 281, version: 8316, delta: added {{eswarm01}{VTa0mSKaS32Rg2VL47P0OQ}{3Yxu1I44Q-2oSR9MGqGWxQ}{10.88.0.19}{10.88.0.19:9301}{mw}}", "cluster.uuid": "nlYLMj0JQwuRWH2Bfk7YhQ", "node.id": "CDPAYocARvC_f9FtrBwusg"  }
> {"type": "server", "timestamp": "2022-03-01T22:40:01,271Z", "level": "INFO", "component": "o.e.c.s.ClusterApplierService", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "added {{eswarm01}{VTa0mSKaS32Rg2VL47P0OQ}{3Yxu1I44Q-2oSR9MGqGWxQ}{10.88.0.19}{10.88.0.19:9301}{mw}}, term: 281, version: 8316, reason: Publication{term=281, version=8316}", "cluster.uuid": "nlYLMj0JQwuRWH2Bfk7YhQ", "node.id": "CDPAYocARvC_f9FtrBwusg"  }
> {"type": "server", "timestamp": "2022-03-01T22:40:02,628Z", "level": "INFO", "component": "o.e.c.s.MasterService", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "node-join[{escold01}{y-_AN6loRMK9j1wTTEA-Ug}{GdAjA7CtTKiSAqVY0mM15w}{10.88.0.19}{10.88.0.19:9302}{cm} join existing leader], term: 281, version: 8318, delta: added {{escold01}{y-_AN6loRMK9j1wTTEA-Ug}{GdAjA7CtTKiSAqVY0mM15w}{10.88.0.19}{10.88.0.19:9302}{cm}}", "cluster.uuid": "nlYLMj0JQwuRWH2Bfk7YhQ", "node.id": "CDPAYocARvC_f9FtrBwusg"  }
> {"type": "server", "timestamp": "2022-03-01T22:40:02,651Z", "level": "INFO", "component": "o.e.m.j.JvmGcMonitorService", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "[gc][8] overhead, spent [418ms] collecting in the last [1s]", "cluster.uuid": "nlYLMj0JQwuRWH2Bfk7YhQ", "node.id": "CDPAYocARvC_f9FtrBwusg"  }
> {"type": "server", "timestamp": "2022-03-01T22:40:03,491Z", "level": "INFO", "component": "o.e.c.s.ClusterApplierService", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "added {{escold01}{y-_AN6loRMK9j1wTTEA-Ug}{GdAjA7CtTKiSAqVY0mM15w}{10.88.0.19}{10.88.0.19:9302}{cm}}, term: 281, version: 8318, reason: Publication{term=281, version=8318}", "cluster.uuid": "nlYLMj0JQwuRWH2Bfk7YhQ", "node.id": "CDPAYocARvC_f9FtrBwusg"  }
> {"type": "server", "timestamp": "2022-03-01T22:40:03,498Z", "level": "INFO", "component": "o.e.c.s.MasterService", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "node-join[{es02}{hrXk0AOHRWqTiLmzZdbghw}{ElvgvrEcSxqlE-mHSjSgPA}{10.88.0.19}{10.88.0.19:9304}{dimt} join existing leader], term: 281, version: 8319, delta: added {{es02}{hrXk0AOHRWqTiLmzZdbghw}{ElvgvrEcSxqlE-mHSjSgPA}{10.88.0.19}{10.88.0.19:9304}{dimt}}", "cluster.uuid": "nlYLMj0JQwuRWH2Bfk7YhQ", "node.id": "CDPAYocARvC_f9FtrBwusg"  }
> {"type": "server", "timestamp": "2022-03-01T22:40:04,172Z", "level": "INFO", "component": "o.e.c.s.ClusterApplierService", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "added {{es02}{hrXk0AOHRWqTiLmzZdbghw}{ElvgvrEcSxqlE-mHSjSgPA}{10.88.0.19}{10.88.0.19:9304}{dimt}}, term: 281, version: 8319, reason: Publication{term=281, version=8319}", "cluster.uuid": "nlYLMj0JQwuRWH2Bfk7YhQ", "node.id": "CDPAYocARvC_f9FtrBwusg"  }

Kibana is able to load.
But this is not what I want: eswarm01 and escold01 should not be master nodes.

The problem is that these hostnames all resolve to 127.0.0.1.

What should I put then?

Each of these nodes runs in its own container inside a common pod.

They can be hostnames, but they need to resolve to an IP that isn't a loopback one.
Otherwise you might need to use IPs.

I am sorry, I do not understand.
These values were copied from the initial docker-compose settings; this is how the seed hosts were defined when running in a pod of containers.
Please suggest some values, I am lost.

Looks like your master nodes are all running on nonstandard ports, so you'll have to specify the ports: 10.88.0.18:9301, 10.88.0.18:9302, 10.88.0.18:9303. When running multiple nodes in the same place you should typically set transport.port on each one so that you know where to find them. Otherwise they will just choose the first available port which is very confusing and will change each time they start up.
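As a rough sketch only, assuming the pod keeps the address 10.88.0.18 and the three master-eligible nodes are pinned to ports 9301-9303 (these assignments are assumptions, not taken from your current config), the relevant lines on each master node could look like this:

> # es01 (assumed pinned to 9301)
> transport.port: 9301
> discovery.seed_hosts: ["10.88.0.18:9302","10.88.0.18:9303"]
>
> # es02 (assumed pinned to 9302)
> transport.port: 9302
> discovery.seed_hosts: ["10.88.0.18:9301","10.88.0.18:9303"]
>
> # es03 (assumed pinned to 9303)
> transport.port: 9303
> discovery.seed_hosts: ["10.88.0.18:9301","10.88.0.18:9302"]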

I tried to set transport.port and seed_hosts as you recommended. I noticed the assigned address in the logs is 10.88.0.19, so I used it.
I also restricted the seed hosts to the master nodes.
So es01 uses port 9301, es02 uses port 9302, and es03 uses port 9303.

Example of es01's elasticsearch.yml:

> node.name: es01
> cluster.name: es-docker-cluster
> node.roles: master,data,ingest,transform
> network.host: 0.0.0.0
> transport.port: 9301
> discovery.seed_hosts: ["10.88.0.19:9302","10.88.0.19:9303"]
> cluster.initial_master_nodes: ["es01","es02","es03"]

It still cannot discover the other nodes; it seems I am missing some configuration.

> {"type": "server", "timestamp": "2022-03-02T13:32:06,707Z", "level": "INFO", "component": "o.e.t.TransportService", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "publish_address {10.88.0.20:9301}, bound_addresses {[::]:9301}" }
> {"type": "server", "timestamp": "2022-03-02T13:32:12,294Z", "level": "INFO", "component": "o.e.b.BootstrapChecks", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "bound or publishing to a non-loopback address, enforcing bootstrap checks" }
> {"type": "server", "timestamp": "2022-03-02T13:32:12,316Z", "level": "INFO", "component": "o.e.c.c.Coordinator", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "cluster UUID [nlYLMj0JQwuRWH2Bfk7YhQ]" }
> {"type": "server", "timestamp": "2022-03-02T13:32:22,364Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "master not discovered or elected yet, an election requires at least 3 nodes with ids from [CDPAYocARvC_f9FtrBwusg, VTa0mSKaS32Rg2VL47P0OQ, SofMGtv0TmCPQdFltXAMOQ, hrXk0AOHRWqTiLmzZdbghw, y-_AN6loRMK9j1wTTEA-Ug], have only discovered non-quorum [{es01}{CDPAYocARvC_f9FtrBwusg}{DuFMBx54Rm2FZ_jG70QINw}{10.88.0.20}{10.88.0.20:9301}{dimt}]; discovery will continue using [10.88.0.19:9302, 10.88.0.19:9303, 10.88.0.19:9304, 10.88.0.19:9305] from hosts providers and [{es01}{CDPAYocARvC_f9FtrBwusg}{DuFMBx54Rm2FZ_jG70QINw}{10.88.0.20}{10.88.0.20:9301}{dimt}] from last-known cluster state; node term 281, last-accepted version 8410 in term 281" }
> {"type": "server", "timestamp": "2022-03-02T13:32:32,366Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "master not discovered or elected yet, an election requires at least 3 nodes with ids from [CDPAYocARvC_f9FtrBwusg, VTa0mSKaS32Rg2VL47P0OQ, SofMGtv0TmCPQdFltXAMOQ, hrXk0AOHRWqTiLmzZdbghw, y-_AN6loRMK9j1wTTEA-Ug], have only discovered non-quorum [{es01}{CDPAYocARvC_f9FtrBwusg}{DuFMBx54Rm2FZ_jG70QINw}{10.88.0.20}{10.88.0.20:9301}{dimt}]; discovery will continue using [10.88.0.19:9302, 10.88.0.19:9303, 10.88.0.19:9304, 10.88.0.19:9305] from hosts providers and [{es01}{CDPAYocARvC_f9FtrBwusg}{DuFMBx54Rm2FZ_jG70QINw}{10.88.0.20}{10.88.0.20:9301}{dimt}] from last-known cluster state; node term 281, last-accepted version 8410 in term 281" }
> {"type": "server", "timestamp": "2022-03-02T13:32:42,368Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "master not discovered or elected yet, an election requires at least 3 nodes with ids from [CDPAYocARvC_f9FtrBwusg, VTa0mSKaS32Rg2VL47P0OQ, SofMGtv0TmCPQdFltXAMOQ, hrXk0AOHRWqTiLmzZdbghw, y-_AN6loRMK9j1wTTEA-Ug], have only discovered non-quorum [{es01}{CDPAYocARvC_f9FtrBwusg}{DuFMBx54Rm2FZ_jG70QINw}{10.88.0.20}{10.88.0.20:9301}{dimt}]; discovery will continue using [10.88.0.19:9302, 10.88.0.19:9303, 10.88.0.19:9304, 10.88.0.19:9305] from hosts providers and [{es01}{CDPAYocARvC_f9FtrBwusg}{DuFMBx54Rm2FZ_jG70QINw}{10.88.0.20}{10.88.0.20:9301}{dimt}] from last-known cluster state; node term 281, last-accepted version 8410 in term 281" }
> {"type": "server", "timestamp": "2022-03-02T13:32:42,416Z", "level": "WARN", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "timed out while waiting for initial discovery state - timeout: 30s" }
> {"type": "server", "timestamp": "2022-03-02T13:32:42,441Z", "level": "INFO", "component": "o.e.h.AbstractHttpServerTransport", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "publish_address {10.88.0.20:9200}, bound_addresses {[::]:9200}" }
> {"type": "server", "timestamp": "2022-03-02T13:32:42,441Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "started" }
> {"type": "server", "timestamp": "2022-03-02T13:32:52,370Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "master not discovered or elected yet, an election requires at least 3 nodes with ids from [CDPAYocARvC_f9FtrBwusg, VTa0mSKaS32Rg2VL47P0OQ, SofMGtv0TmCPQdFltXAMOQ, hrXk0AOHRWqTiLmzZdbghw, y-_AN6loRMK9j1wTTEA-Ug], have only discovered non-quorum [{es01}{CDPAYocARvC_f9FtrBwusg}{DuFMBx54Rm2FZ_jG70QINw}{10.88.0.20}{10.88.0.20:9301}{dimt}]; discovery will continue using [10.88.0.19:9302, 10.88.0.19:9303, 10.88.0.19:9304, 10.88.0.19:9305] from hosts providers and [{es01}{CDPAYocARvC_f9FtrBwusg}{DuFMBx54Rm2FZ_jG70QINw}{10.88.0.20}{10.88.0.20:9301}{dimt}] from last-known cluster state; node term 281, last-accepted version 8410 in term 281" }

Do you have any suggestion?

Are the other nodes running at 10.88.0.19:9302, 10.88.0.19:9303, 10.88.0.19:9304, 10.88.0.19:9305? This node's address is now 10.88.0.20:9301 which is different.

They are all running under 10.88.0.20 now. I do not understand where this address comes from; it is not specified in elasticsearch.yml, so it seems to be assigned dynamically.
Is there a way to set it in the yml file?

Elasticsearch just uses the address of the network interface it sees. If you have multiple network interfaces then it chooses one of them. You can set network.host: 10.88.0.20 to force it to choose that one.
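A sketch of how that could look for es01, assuming the pod keeps the address 10.88.0.20 across restarts (if it can change, you would need a stable address or a hostname that resolves to it):

> network.host: 10.88.0.20
> transport.port: 9301
> discovery.seed_hosts: ["10.88.0.20:9302","10.88.0.20:9303"]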

Thanks.
In the case of containers on a host server, should network.host be set to something like 10.88.0.20, or to the real address of the host server
(in my case the server that hosts the containers is 172.16.82.115)?
We are also looking at distributing the containers across two servers, for example es01 and es02 on one server and es03 on another.

It depends on your network config but Elasticsearch doesn't care. See these docs for more details:

Well, it does not seem to work.
I changed network.host to the real IP address of the VM.

Here is the new yml file (the other nodes are the same, each with its own transport.port from 9302 to 9305):

> node.name: es01
> cluster.name: es-docker-cluster
> node.roles: master,data,ingest,transform
> network.host: 172.16.82.115
> transport.port: 9301
> discovery.seed_hosts: ["172.16.82.115:9302","172.16.82.115:9303"]

Is there any reason it "Failed to bind to 172.16.82.115:9301"

> {"type": "server", "timestamp": "2022-03-02T15:48:56,522Z", "level": "INFO", "component": "o.e.e.NodeEnvironment", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/mapper/rhel-root)]], net usable_space [53.6gb], net total_space [69.9gb], types [xfs]" }
> {"type": "server", "timestamp": "2022-03-02T15:48:56,522Z", "level": "INFO", "component": "o.e.e.NodeEnvironment", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "heap size [1gb], compressed ordinary object pointers [true]" }
> {"type": "server", "timestamp": "2022-03-02T15:48:56,643Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "node name [es01], node ID [CDPAYocARvC_f9FtrBwusg], cluster name [es-docker-cluster], roles [master, transform, data, ingest]" }
> {"type": "server", "timestamp": "2022-03-02T15:49:18,018Z", "level": "INFO", "component": "o.e.x.m.p.l.CppLogMessageHandler", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "[controller/302] [Main.cc@123] controller (64 bit): Version 8.0.0 (Build 5e85495ea85316) Copyright (c) 2022 Elasticsearch BV" }
> {"type": "server", "timestamp": "2022-03-02T15:49:18,664Z", "level": "INFO", "component": "o.e.x.s.Security", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "Security is disabled" }
> {"type": "server", "timestamp": "2022-03-02T15:49:21,340Z", "level": "INFO", "component": "o.e.t.n.NettyAllocator", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "creating NettyAllocator with the following configs: [name=unpooled, suggested_max_allocation_size=1mb, factors={es.unsafe.use_unpooled_allocator=null, g1gc_enabled=true, g1gc_region_size=4mb, heap_size=1gb}]" }
> {"type": "server", "timestamp": "2022-03-02T15:49:21,523Z", "level": "INFO", "component": "o.e.d.DiscoveryModule", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "using discovery type [zen] and seed hosts providers [settings]" }
> {"type": "server", "timestamp": "2022-03-02T15:49:25,207Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "initialized" }
> {"type": "server", "timestamp": "2022-03-02T15:49:25,210Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "starting ..." }
> {"type": "server", "timestamp": "2022-03-02T15:49:25,262Z", "level": "INFO", "component": "o.e.x.s.c.f.PersistentCache", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "persistent cache index loaded" }
> {"type": "server", "timestamp": "2022-03-02T15:49:25,263Z", "level": "INFO", "component": "o.e.x.d.l.DeprecationIndexingComponent", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "deprecation component started" }
> {"type": "server", "timestamp": "2022-03-02T15:49:25,495Z", "level": "ERROR", "component": "o.e.b.ElasticsearchUncaughtExceptionHandler", "cluster.name": "es-docker-cluster", "node.name": "es01", "message": "uncaught exception in thread [main]",
> "stacktrace": ["org.elasticsearch.bootstrap.StartupException: org.elasticsearch.transport.BindTransportException: Failed to bind to 172.16.82.115:9301",
> "at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:170) ~[elasticsearch-8.0.0.jar:8.0.0]",
> "at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:157) ~[elasticsearch-8.0.0.jar:8.0.0]",
> "at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77) ~[elasticsearch-8.0.0.jar:8.0.0]",
> "at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112) ~[elasticsearch-cli-8.0.0.jar:8.0.0]",
> "at org.elasticsearch.cli.Command.main(Command.java:77) ~[elasticsearch-cli-8.0.0.jar:8.0.0]",
> "at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:122) ~[elasticsearch-8.0.0.jar:8.0.0]",
> "at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80) ~[elasticsearch-8.0.0.jar:8.0.0]",
> "Caused by: org.elasticsearch.transport.BindTransportException: Failed to bind to 172.16.82.115:9301",
> "at org.elasticsearch.transport.TcpTransport.bindToPort(TcpTransport.java:435) ~[elasticsearch-8.0.0.jar:8.0.0]",
> "at org.elasticsearch.transport.TcpTransport.bindServer(TcpTransport.java:396) ~[elasticsearch-8.0.0.jar:8.0.0]",
> "at org.elasticsearch.transport.netty4.Netty4Transport.doStart(Netty4Transport.java:136) ~[?:?]",
> "at org.elasticsearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:48) ~[elasticsearch-8.0.0.jar:8.0.0]",
> "at org.elasticsearch.transport.TransportService.doStart(TransportService.java:269) ~[elasticsearch-8.0.0.jar:8.0.0]",
> "at org.elasticsearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:48) ~[elasticsearch-8.0.0.jar:8.0.0]",
> "at org.elasticsearch.node.Node.start(Node.java:1116) ~[elasticsearch-8.0.0.jar:8.0.0]",
> "at org.elasticsearch.bootstrap.Bootstrap.start(Bootstrap.java:272) ~[elasticsearch-8.0.0.jar:8.0.0]",
> "at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:367) ~[elasticsearch-8.0.0.jar:8.0.0]",
> "at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:166) ~[elasticsearch-8.0.0.jar:8.0.0]",
> "... 6 more",
> "Caused by: java.net.BindException: Cannot assign requested address",
> "at sun.nio.ch.Net.bind0(Native Method) ~[?:?]",
> "at sun.nio.ch.Net.bind(Net.java:555) ~[?:?]",
> "at sun.nio.ch.ServerSocketChannelImpl.netBind(ServerSocketChannelImpl.java:337) ~[?:?]",
> "at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:294) ~[?:?]",
> "at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:134) ~[?:?]",
> "at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:562) ~[?:?]",
> "at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1334) ~[?:?]",
> "at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:506) ~[?:?]",
> "at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:491) ~[?:?]",
> "at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:973) ~[?:?]",
> "at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:260) ~[?:?]",
> "at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:356) ~[?:?]",
> "at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) ~[?:?]",
> "at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469) ~[?:?]",
> "at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) ~[?:?]",
> "at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) ~[?:?]",
> "at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]",
> "at java.lang.Thread.run(Thread.java:833) [?:?]"] }
> uncaught exception in thread [main]
> org.elasticsearch.transport.BindTransportException: Failed to bind to 172.16.82.115:9301
> Likely root cause: java.net.BindException: Cannot assign requested address
>         at java.base/sun.nio.ch.Net.bind0(Native Method)
>         at java.base/sun.nio.ch.Net.bind(Net.java:555)
>         at java.base/sun.nio.ch.ServerSocketChannelImpl.netBind(ServerSocketChannelImpl.java:337)
>         at java.base/sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:294)
>         at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:134)
>         at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:562)
>         at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1334)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:506)
>         at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:491)
>         at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:973)
>         at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:260)
>         at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:356)
>         at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
>         at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
>         at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>         at java.base/java.lang.Thread.run(Thread.java:833)
> For complete error details, refer to the log at /usr/share/elasticsearch/logs/es-docker-cluster.log

This is covered in the docs I linked:

Elasticsearch can only bind to an address if it is running on a host that has a network interface with that address.
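Inside these containers the only interfaces are the loopback and the pod network (10.88.0.x in the logs above), so, as a sketch based on the addresses seen earlier in this thread:

> # bindable inside the container: the pod interface owns this address
> network.host: 10.88.0.20
>
> # not bindable inside the container: this address belongs to the host VM
> # network.host: 172.16.82.115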