Upgrading from ES 6.8 to 7.17 and got MasterNotDiscoveredException

Hello community,

In my Elasticsearch upgrade craziness, I was finally able to reindex my old ES 5.0 cluster into 6.8, though it took a lot of adjustments because of the one-mapping-type-per-index rule, but I digress.
I run Elasticsearch as Docker containers on plain AWS EC2 instances, so I'm using the EC2 Discovery plugin along with the S3 Snapshot Repository plugin.

Now, on 7.17.6, I'm facing a very weird error.
I haven't changed anything in my elasticsearch.yml config, but this warning now repeats over and over in the logs:

[2022-10-07T21:40:46,012][WARN ][o.e.c.c.ClusterFormationFailureHelper] [basics-0-master] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and [cluster.initial_master_nodes] is empty on this node: have discovered [{basics-0-master}{Jl3...NRw}{0B9...vPA}{172.1.1.244}{172.1.1.244:9300}{ilmr}, {basics-1-master}{3aT...3AVQ}{_bK...aKA}{172.1.1.34}{172.1.1.34:9300}{ilmr}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9301, 127.0.0.1:9302, 127.0.0.1:9303, 127.0.0.1:9304, 127.0.0.1:9305, [::1]:9300, [::1]:9301, [::1]:9302, [::1]:9303, [::1]:9304, [::1]:9305, 172.1.1.244:9300, 172.1.1.34:9300, 172.1.1.5:9300, 172.1.1.36:9300] from hosts providers and [{basics-0-master}{Jl3...vPA}{172.1.1.244}{172.1.1.244:9300}{ilmr}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
[2022-10-07T21:40:49,535][WARN ][r.suppressed             ] [basics-0-master] path: /_cluster/health, params: {}
org.elasticsearch.discovery.MasterNotDiscoveredException: null
        at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$2.onTimeout(TransportMasterNodeAction.java:297) [elasticsearch-7.17.6.jar:7.17.6]
        at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:345) [elasticsearch-7.17.6.jar:7.17.6]
        at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:263) [elasticsearch-7.17.6.jar:7.17.6]
        at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:660) [elasticsearch-7.17.6.jar:7.17.6]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:718) [elasticsearch-7.17.6.jar:7.17.6]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
        at java.lang.Thread.run(Thread.java:833) [?:?]

The node then starts polling all of those addresses, including the two data nodes:

[2022-10-07T21:58:19,325][WARN ][o.e.d.PeerFinder         ] [basics-0-master] address [127.0.0.1:9304], node [null], requesting [false] connection failed: [][127.0.0.1:9304] connect_exception: Connection refused: /127.0.0.1:9304: Connection refused
[2022-10-07T21:58:19,325][WARN ][o.e.d.PeerFinder         ] [basics-0-master] address [127.0.0.1:9301], node [null], requesting [false] connection failed: [][127.0.0.1:9301] connect_exception: Connection refused: /127.0.0.1:9301: Connection refused
[2022-10-07T21:58:19,325][WARN ][o.e.d.PeerFinder         ] [basics-0-master] address [127.0.0.1:9302], node [null], requesting [false] connection failed: [][127.0.0.1:9302] connect_exception: Connection refused: /127.0.0.1:9302: Connection refused
[2022-10-07T21:58:19,325][WARN ][o.e.d.PeerFinder         ] [basics-0-master] address [[::1]:9305], node [null], requesting [false] connection failed: [][[::1]:9305] connect_exception: Connection refused: /[0:0:0:0:0:0:0:1]:9305: Connection refused
[2022-10-07T21:58:19,325][WARN ][o.e.d.PeerFinder         ] [basics-0-master] address [127.0.0.1:9305], node [null], requesting [false] connection failed: [][127.0.0.1:9305] connect_exception: Connection refused: /127.0.0.1:9305: Connection refused
[2022-10-07T21:58:19,325][WARN ][o.e.d.PeerFinder         ] [basics-0-master] address [127.0.0.1:9303], node [null], requesting [false] connection failed: [][127.0.0.1:9303] connect_exception: Connection refused: /127.0.0.1:9303: Connection refused
[2022-10-07T21:58:19,326][WARN ][o.e.d.PeerFinder         ] [basics-0-master] address [[::1]:9303], node [null], requesting [false] connection failed: [][[::1]:9303] connect_exception: Connection refused: /[0:0:0:0:0:0:0:1]:9303: Connection refused
[2022-10-07T21:58:19,326][WARN ][o.e.d.PeerFinder         ] [basics-0-master] address [[::1]:9301], node [null], requesting [false] connection failed: [][[::1]:9301] connect_exception: Connection refused: /[0:0:0:0:0:0:0:1]:9301: Connection refused
[2022-10-07T21:58:19,326][WARN ][o.e.d.PeerFinder         ] [basics-0-master] address [[::1]:9304], node [null], requesting [false] connection failed: [][[::1]:9304] connect_exception: Connection refused: /[0:0:0:0:0:0:0:1]:9304: Connection refused
[2022-10-07T21:58:19,326][WARN ][o.e.d.PeerFinder         ] [basics-0-master] address [[::1]:9300], node [null], requesting [false] connection failed: [][[::1]:9300] connect_exception: Connection refused: /[0:0:0:0:0:0:0:1]:9300: Connection refused
[2022-10-07T21:58:19,326][WARN ][o.e.d.PeerFinder         ] [basics-0-master] address [127.0.0.1:9300], node [null], requesting [false] connection failed: [][127.0.0.1:9300] connect_exception: Connection refused: /127.0.0.1:9300: Connection refused
[2022-10-07T21:58:19,336][WARN ][o.e.d.PeerFinder         ] [basics-0-master] address [172.1.1.36:9300], node [null], requesting [false] connection failed: [basics-0-data][172.1.1.36:9300] non-master-eligible node found
[2022-10-07T21:58:19,339][WARN ][o.e.d.PeerFinder         ] [basics-0-master] address [172.1.1.5:9300], node [null], requesting [false] connection failed: [basics-1-data][172.1.1.5:9300] non-master-eligible node found

Am I missing a setting I don't know about?
This is the elasticsearch.yml for my master nodes:

cluster.name: ${CLUSTER_NAME}
node.name: ${NODE_NAME}

# HOST info
network.host: ${ES_HOST}
http.port: ${ES_PORT}

node.master: true
node.data: false

reindex.remote.whitelist: "172.1.*.*:9200"
reindex.ssl.verification_mode: none

xpack.monitoring.enabled: false
discovery.seed_providers: ec2
discovery:
  zen:
    minimum_master_nodes: 2

discovery.ec2.tag.es_cluster: ${CLUSTER_NAME}

xpack.security.audit.enabled: true

xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: certs/elasticsearch_server_keystore.pfx
xpack.security.transport.ssl.truststore.path: certs/elasticsearch_server_truststore.p12
xpack.security.transport.ssl.keystore.password: ${ES_KEY_STORE_PASSWORD}
xpack.security.transport.ssl.truststore.password: ${ES_TRUST_STORE_PASSWORD}

xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: certs/elasticsearch_server_keystore.pfx
xpack.security.http.ssl.keystore.password: ${ES_KEY_STORE_PASSWORD}
xpack.security.http.ssl.truststore.password: ${ES_TRUST_STORE_PASSWORD}
xpack.security.http.ssl.certificate_authorities: ["certs/custom_ca.pem"]
xpack.security.http.ssl.client_authentication: optional
xpack.security.authc.realms.pki.pki1.order: 0
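
Going by the warning above, the setting the log complains about is cluster.initial_master_nodes. Below is a minimal sketch of what I understand it would look like, assuming basics-0-master and basics-1-master (the node names from my logs) are the master-eligible nodes. I have not confirmed that this is the right fix for an upgraded cluster, since as far as I understand the setting is only meant for bootstrapping a brand-new cluster:

```yaml
# Sketch only, not a confirmed fix.
# Assumption: these two names (taken from the log output above) are the
# master-eligible nodes; they must match each node's node.name exactly.
# Only relevant the first time a brand-new 7.x cluster is bootstrapped.
cluster.initial_master_nodes:
  - basics-0-master
  - basics-1-master

# Note: discovery.zen.minimum_master_nodes is still accepted by 7.x nodes
# but is ignored, so the block from my current config has no effect there.
```

The warning also reports "node term 0, last-accepted version 0", so it looks like the node started without any previous cluster state, which might point to the data path not being carried over into the new container.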
