Master not discovered or elected yet, an election requires a node with id [7pxH2sBjRcG2IZzFDfdKGg]

Hi all,
Our cluster (unrelated to Elasticsearch) handles its own master election in an active/standby architecture.
I'm trying to set up Elasticsearch to work with this architecture.
We prepare the configuration file for each Elasticsearch node (the active node is set to master=true, the standby to master=false) and then start the Elasticsearch process.
On first startup, it works fine.
But after I trigger a switchover of our own cluster, which shuts down both Elasticsearch processes and reconfigures them so that the previous master is set to master=false and the other node to master=true,
the node that should now be master requires the previous master to be connected.

{"type": "server", "timestamp": "2020-02-27T11:58:34,435Z", "level": "INFO", "component": "o.e.b.BootstrapChecks", "cluster.name": "dn-elasticsearch", "node.name": "dn55-ncc1", "message": "bound or publishing to a non-loopback address, enforcing bootstrap checks" }
{"type": "server", "timestamp": "2020-02-27T11:58:34,460Z", "level": "INFO", "component": "o.e.c.c.Coordinator", "cluster.name": "dn-elasticsearch", "node.name": "dn55-ncc1", "message": "cluster UUID [MmH0OV7KQPO8rX5TnkFatw]" }
{"type": "server", "timestamp": "2020-02-27T11:58:44,471Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "dn-elasticsearch", "node.name": "dn55-ncc1", "message": "master not discovered or elected yet, an election requires a node with id [7pxH2sBjRcG2IZzFDfdKGg], have discovered [{dn55-ncc1}{KaGtc-BUSkq0IBLf-br3XA}{rUqDt1J9RDyyps9lxZ2PNw}{10.254.1.7}{10.254.1.7:9300}{dilm}{ml.machine_memory=67012685824, xpack.installed=true, ml.max_open_jobs=20}] which is not a quorum; discovery will continue using from hosts providers and [{dn55-ncc1}{KaGtc-BUSkq0IBLf-br3XA}{rUqDt1J9RDyyps9lxZ2PNw}{10.254.1.7}{10.254.1.7:9300}{dilm}{ml.machine_memory=67012685824, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 1, last-accepted version 43 in term 1" }
{"type": "server", "timestamp": "2020-02-27T11:58:54,472Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "dn-elasticsearch", "node.name": "dn55-ncc1", "message": "master not discovered or elected yet, an election requires a node with id [7pxH2sBjRcG2IZzFDfdKGg], have discovered [{dn55-ncc1}{KaGtc-BUSkq0IBLf-br3XA}{rUqDt1J9RDyyps9lxZ2PNw}{10.254.1.7}{10.254.1.7:9300}{dilm}{ml.machine_memory=67012685824, xpack.installed=true, ml.max_open_jobs=20}] which is not a quorum; discovery will continue using from hosts providers and [{dn55-ncc1}{KaGtc-BUSkq0IBLf-br3XA}{rUqDt1J9RDyyps9lxZ2PNw}{10.254.1.7}{10.254.1.7:9300}{dilm}{ml.machine_memory=67012685824, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 1, last-accepted version 43 in term 1" }
{"type": "server", "timestamp": "2020-02-27T11:59:04,807Z", "level": "DEBUG", "component": "o.e.a.s.m.TransportMasterNodeAction", "cluster.name": "dn-elasticsearch", "node.name": "dn55-ncc1", "message": "no known master node, scheduling a
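
As a reading aid for the WARN entries above, here is a minimal Python sketch that pulls the required node id and the discovered node ids out of the message and compares them. It assumes the single-node message format shown in this log (other wordings of the warning are not handled), and the helper names `required_ids` and `discovered_ids` are made up for illustration; they are not part of Elasticsearch. The message string is abridged from the log, with the node attributes trimmed.

```python
import re

# Message text abridged from the WARN entry in the log above.
message = (
    "master not discovered or elected yet, an election requires a node with id "
    "[7pxH2sBjRcG2IZzFDfdKGg], have discovered "
    "[{dn55-ncc1}{KaGtc-BUSkq0IBLf-br3XA}{rUqDt1J9RDyyps9lxZ2PNw}"
    "{10.254.1.7}{10.254.1.7:9300}{dilm}] which is not a quorum"
)


def required_ids(msg):
    """Node ids the election is waiting for (single-id message format only)."""
    match = re.search(r"requires a node with id \[([^\]]+)\]", msg)
    return {match.group(1)} if match else set()


def discovered_ids(msg):
    """Persistent node ids of the nodes listed after 'have discovered'."""
    match = re.search(r"have discovered \[(.*?)\] which is not a quorum", msg)
    if not match:
        return set()
    # Each node prints as {name}{node id}{ephemeral id}{host}{address}{roles},
    # optionally followed by an {attributes} group; the node id is the second group.
    node_pattern = (
        r"\{([^{}]+)\}\{([^{}]+)\}\{[^{}]+\}\{[^{}]+\}\{[^{}]+\}\{[^{}]+\}"
        r"(?:\{[^{}]*\})?"
    )
    return {node_id for _name, node_id in re.findall(node_pattern, match.group(1))}


missing = required_ids(message) - discovered_ids(message)
print("required but not discovered:", missing)
```

The id the election requires is not the id of dn55-ncc1, the only node discovered, which is why the node keeps reporting "not a quorum".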

master config:

cluster.initial_master_nodes:
- 10.254.1.7:9300
cluster.name: dn-elasticsearch
discovery.seed_hosts:
- 10.254.1.7:9300
http.host: 0.0.0.0
http.port: {ELASTICSEARCH_HTTP_PORT}
indices.memory.index_buffer_size: 512mb
indices.memory.min_index_buffer_size: 32mb
network.bind_host: 0.0.0.0
network.host: 0.0.0.0
network.publish_host: 10.254.1.7
node.data: true
node.ingest: true
node.master: true
node.name: {EXT_HOSTNAME}
path.data: /techsupport
path.repo: /elasticsearch_backup
transport.bind_host: 0.0.0.0
transport.host: 10.254.1.7
transport.port: '9300'

other node config (was master before):

cluster.initial_master_nodes:
- 10.254.1.7:9300
cluster.name: dn-elasticsearch
discovery.seed_hosts:
- 10.254.1.7:9300
http.host: 0.0.0.0
http.port: {ELASTICSEARCH_HTTP_PORT}
indices.memory.index_buffer_size: 512mb
indices.memory.min_index_buffer_size: 32mb
network.bind_host: 0.0.0.0
network.host: 0.0.0.0
network.publish_host: 10.254.1.70
node.data: true
node.ingest: false
node.master: false
node.name: {EXT_HOSTNAME}
path.data: /techsupport
path.repo: /elasticsearch_backup
transport.bind_host: 0.0.0.0
transport.host: 10.254.1.70
transport.port: '9300'

In our cluster, one node gets 10.254.1.70 as its IP and the second gets 10.254.1.71.
10.254.1.7 is a floating IP that is always assigned to the master.

To sum up, I need to disable all of Elasticsearch's election procedures, since we handle election ourselves, or at least find a way to make it work within our constraints.
We collect all logs from our cluster on the active node, and we only need replication to cover switchovers, so it's basically a 2-node cluster. (The other nodes are just data providers and are not master-eligible.)

Please help me figure it out.

Best regards,
Oded

It is not even theoretically possible to do what you are trying to do without data loss, which is why Elasticsearch is stopping you from doing it. You need three master-eligible nodes to support failover. This isn't a restriction imposed by Elasticsearch so much as a fundamental property of distributed systems.
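
To make the majority argument concrete, here is a minimal sketch in Python. It is a simplified model of majority voting, not Elasticsearch's actual implementation, and the node names and function names are hypothetical.

```python
def has_quorum(voting_nodes, reachable_nodes):
    """True if a strict majority of the voting nodes is reachable."""
    voters = set(voting_nodes)
    votes = len(voters & set(reachable_nodes))
    return votes > len(voters) / 2


def survives_any_single_failure(voting_nodes):
    """True if a master can still be elected after losing any one voting node."""
    voters = set(voting_nodes)
    return all(has_quorum(voters, voters - {failed}) for failed in voters)


# Hypothetical node names, for illustration only.
for voters in (["es-a"], ["es-a", "es-b"], ["es-a", "es-b", "es-c"]):
    print(len(voters), "voting node(s):", survives_any_single_failure(voters))
```

In this model, one or two voting nodes cannot lose any single voter and still elect a master; three is the smallest number that can.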

We don't need any sharding of the data, only replication.
We need the full data set on both nodes.

Isn't it possible in some way to configure/manipulate this?

Yes indeed: use three master-eligible nodes.

It's not really clear what problem you are trying to solve with the process you are describing. Perhaps you are trying to control precisely which node is currently elected as the master of the cluster? Can you explain why you think that matters?

Indeed, I have to control which node is the master, because we only have 2 master-eligible nodes and we have to support a state where only one of them is connected.
In addition, I need to avoid a split-brain state, but without adding a third master node.
(If it helps, I can run another master-eligible Elasticsearch node on the machine that I want to be the master.)
Can you suggest the best solution for my needs under these constraints?

No, sorry, I still don't understand the problem you're trying to solve so I can't offer any suggestions.

OK, I'll keep it simple.
How do you suggest deploying Elasticsearch on 2 nodes?

There are two sensible options:

  1. both nodes master-eligible
  2. one node master-eligible and one node not.

In both cases there is an unavoidable single point of failure, but that's the best you can do if you've got fewer than three nodes. As I said above, this isn't a restriction imposed by Elasticsearch so much as a fundamental property of distributed systems.
