Master not discovered exception for two node cluster

I have set up a two-node production ELK cluster on version 7.8.0, running in Docker on two host VMs. Transport and HTTP encryption have been enabled.

Current master node

[elastic@pelk01 ~]$ curl -k https://127.0.0.1:9200/_cat/master?v -u elastic
Enter host password for user 'elastic':
id                     host           ip             node
CBFgPsODTrOAr2Y5jb5xQw 192.168.154.111 192.168.154.111 pelk02

Current node status

[elastic@pelk01 ~]$ curl -k https://127.0.0.1:9200/_cat/nodes?v -u elastic
Enter host password for user 'elastic':
ip             heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.154.110           39          99   1    0.10    0.10     0.12 dimrt     -      pelk01
192.168.154.111           37          99   0    0.12    0.10     0.06 dimrt     *      pelk02

When I stop the ELK applications on the master node, pelk02, the cluster becomes unstable and returns the following error

[elastic@pelk01 ~]$ curl -k https://127.0.0.1:9200/_cat/nodes?v -u elastic
Enter host password for user 'elastic':
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}

When I stop the ELK applications on pelk01, there are no issues and ELK works fine.

It would be helpful if you could review the ES configuration below and tell me whether it is the ideal configuration for a highly available two-node ES cluster. Also, how can the master_not_discovered_exception that occurs when stopping the master node be eliminated, so that the cluster stays available to Kibana through pelk01 when pelk02 is stopped?
I'm also planning to add a third node to the cluster in the future. Please also share the ideal configuration for adding the third node, so that the cluster remains highly available even when any one of the three ELK applications is stopped.

elasticsearch.yml configuration on pelk01

---
cluster.name: p-elk-cluster
node.name: pelk01
network.host: 0.0.0.0
network.publish_host: 192.168.154.110
discovery.seed_hosts: ["192.168.154.110", "192.168.154.111"]
cluster.initial_master_nodes: ["192.168.154.110", "192.168.154.111"]
node.master: true
node.voting_only: false
node.data: true
node.ingest: true
node.ml: false

path.data: /usr/share/elasticsearch/data
path.logs: /usr/share/elasticsearch/logs

## X-Pack settings
xpack.license.self_generated.type: basic
xpack.monitoring.collection.enabled: true

xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: certs/elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: certs/elastic-certificates.p12

xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: certs/http.p12

elasticsearch.yml configuration on pelk02

---
cluster.name: p-elk-cluster
node.name: pelk02
network.host: 0.0.0.0
network.publish_host: 192.168.154.111
discovery.seed_hosts: ["192.168.154.110", "192.168.154.111"]
cluster.initial_master_nodes: ["192.168.154.110", "192.168.154.111"]
node.master: true
node.voting_only: false
node.data: true
node.ingest: true
node.ml: false

path.data: /usr/share/elasticsearch/data
path.logs: /usr/share/elasticsearch/logs

## X-Pack settings
xpack.license.self_generated.type: basic
xpack.monitoring.collection.enabled: true


xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: certs/elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: certs/elastic-certificates.p12

xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: certs/http.p12

You cannot have a highly available cluster with only 2 nodes. Because Elasticsearch is consensus-based, you need at least 3 master-eligible nodes. With only two, losing a node can leave the cluster unable to elect a master, as you saw when stopping pelk02, so the cluster is not fully operable.

From the docs:

Because it’s not resilient to failures, we do not recommend deploying a two-node cluster in production.

Those docs also answer your question about three node clusters.
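As a rough illustration of what a third node's elasticsearch.yml might look like, here is a minimal sketch modelled on the two configurations posted above. The node name pelk03 and the address 192.168.154.112 are assumptions made up for this example, not values from the thread. The sketch assumes the node joins the existing cluster, in which case cluster.initial_master_nodes is left out because that setting is only used when bootstrapping a brand-new cluster.

```yaml
---
# Hypothetical third node: the name pelk03 and the IP 192.168.154.112
# are placeholders, not values taken from the original post.
cluster.name: p-elk-cluster
node.name: pelk03
network.host: 0.0.0.0
network.publish_host: 192.168.154.112

# List all three master-eligible nodes so the new node can find the
# existing cluster through any of them.
discovery.seed_hosts: ["192.168.154.110", "192.168.154.111", "192.168.154.112"]

# cluster.initial_master_nodes is deliberately omitted: it is only needed
# the first time a new cluster forms, not when joining an existing one.

# Same roles as pelk01 and pelk02, so all three nodes are master-eligible.
node.master: true
node.voting_only: false
node.data: true
node.ingest: true
node.ml: false

path.data: /usr/share/elasticsearch/data
path.logs: /usr/share/elasticsearch/logs
```

The X-Pack security and TLS settings would presumably be copied from the existing nodes, and discovery.seed_hosts on pelk01 and pelk02 would likely need the new address added as well. With three master-eligible nodes, a majority of two can still elect a master when any single node is stopped.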

Thanks @Christian_Dahlqvist for your feedback

Thanks @DavidTurner for your feedback