Architecture for site recovery


I want to install an Elasticsearch cluster on two servers.
The servers are separated by a few kilometres.

I am thinking of creating a master/data node on server 1 and a master/data node on server 2.
That way, if a site becomes unavailable, the data is still available.

Judging from the documentation and many experts, a cluster needs 3 master-eligible nodes.

Where can I put this third master node? On server 1/2 or on another server?
Which roles must it have? Master-only or master/data?

Thanks all for your feedback.


Ideally the third node would be independent, isolated from the other two. If you put it at the same site as one of the other two nodes and that site becomes unavailable then the whole cluster will be unavailable, because it needs at least two of the three masters (a majority) to work. You can, however, make the third node master-only (so it doesn't need so much disk space) and if you can wait for 7.3.0 to be released then you can make it voting-only too (so it won't ever be elected as master).
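For reference, the configuration for such a dedicated tiebreaker node could look roughly like this (a sketch only; `node.voting_only` assumes 7.3+, and the node and host names here are made up):

```yaml
# elasticsearch.yml for a dedicated tiebreaker node (hypothetical names)
cluster.name: my-cluster
node.name: tiebreaker
node.master: true        # master-eligible, so it can take part in elections
node.voting_only: true   # 7.3+: votes in elections but is never elected master
node.data: false         # holds no shard data, so disk requirements stay small
node.ingest: false
discovery.seed_hosts: ["node1.example.com", "node2.example.com"]
```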

The main issue with clusters that are geographically separated is that Elasticsearch expects the node-to-node connections to be reasonably reliable, but wide-area connections are often not reliable enough. Although Elasticsearch will behave correctly even with unreliable connections between nodes it might not necessarily behave optimally.


Thanks David for your answer.

I understand that, ideally, the master-only third node should be installed on a third site.

You say : "If you put it at the same site as one of the other two nodes and that site becomes unavailable then the whole cluster will be unavailable, because it needs at least two of the three masters (a majority) to work"

In my case, I have only two sites with an excellent network. I can imagine a few possible scenarios.

If I set up the following topology:

  1. Site A
  • Node 1: master and data
  • Node 3: master-only
  2. Site B
  • Node 2: master and data

If site A becomes unavailable, I understand the cluster is still accessible in read-only mode.
If site B becomes unavailable, the cluster still works normally.

If I set up the following topology with 2 nodes only:

  1. Site A
  • Node 1: master and data
  2. Site B
  • Node 2: master and data

If one of the sites becomes unavailable, I have understood that since ES 7 the cluster is still accessible in read-only mode and the "split brain" problem is avoided. Can you confirm that? When the site is recovered, the cluster is accessible in write mode again.

An excellent network that's resilient to someone accidentally putting a spade through all the supposedly-independently-routed cables at once? (True story :slightly_smiling_face:)

This sort of depends what you mean by "accessible in read-only mode". By default nodes will do their best to respond to searches even if they cannot contact the elected master node, but their functionality is severely limited and I wouldn't really call the cluster "available" without an elected master. I don't think it's a great idea to plan on being in this state for any length of time.

The cluster will recover correctly once everything is restarted and reconnected. In Elasticsearch 7 it may even be able to maintain an elected master on one site (but not the other). This is about as well as you can expect things to work without another node at an independent third site.

Yes of course :slight_smile:

When I say "accessible in read-only mode", I am referring to the following cluster setting: cluster.no_master_block
By default, all write operations are blocked.
In ES 7, how is the quorum calculated?
With a cluster of 2 master/data nodes, is the quorum still equal to 2 in ES 7?
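For context, the setting I mean can be adjusted in elasticsearch.yml like this (a sketch; in ES 7 `write` is the default):

```yaml
# Behaviour of a node that cannot discover an elected master
cluster.no_master_block: write   # default: reject writes, serve best-effort reads
# cluster.no_master_block: all   # stricter alternative: reject reads as well
```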

If the quorum is equal to 2:
When node 2 is down, I understand there is no elected master node and node 1 only allows reading data (no updates; write operations are blocked).
When node 2 comes back, the cluster is re-formed and allows writing new data.
Thus, the integrity of the cluster data is preserved.

It's complicated :slight_smile:

Typically a 2-node version 7 cluster will define the quorum to be the current master on its own. So the site with the master might remain available without the other site, but the site without the master will normally not be able to continue alone.

There are some docs about this.

Hello David,

I tested the case with 4 master nodes, still on two different sites, following this point of the documentation:

I observed some behaviours that I would like to understand in more detail.

I set up a cluster of four nodes.

  • Site A
    • Node 1: master and data
    • Node 3: master-only
  • Site B
    • Node 2: master and data
    • Node 4: master-only
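The node roles above correspond to elasticsearch.yml settings along these lines (a sketch using the 7.x flat role settings; node groupings as in my topology):

```yaml
# node1 and node2: master-eligible data nodes
node.master: true
node.data: true

# node3 and node4: dedicated master-only nodes
node.master: true
node.data: false
node.ingest: false
```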

Judging from the documentation, only three of the nodes are taken into account for the voting configuration.
I simulated the loss of site B by killing node2 and node4 simultaneously.
Sometimes, a new master node is elected (node 1 or node 3) and the cluster continues to work normally with the 2 nodes of site A.
Other times, no new master node is elected and the cluster stays in a bad state. It recovers its normal state only when a third node reappears.

Do you have an explanation, please?

Thank you.

Yes, that's expected. If you lose half or more of the master-eligible nodes in the cluster then you should not expect the cluster to remain healthy. Once again, the only way to achieve the fault tolerance you seek is with a third node on a third independent site.

You are correct that normally only three of the four master-eligible nodes will appear in the voting configuration, but it is not possible to specify which three it will be. If the three chosen nodes are nodes 1, 2 and 4 then of course terminating nodes 2 and 4 will break your cluster.

If you want site A to remain available when site B fails then you should put two master-eligible nodes in site A and one in site B.
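As a sketch of that layout, with hypothetical node names (nodeA1 and nodeA2 at site A, nodeB1 at site B), each node's elasticsearch.yml would point at the three master-eligible nodes:

```yaml
# Hypothetical names: nodeA1, nodeA2 at site A; nodeB1 at site B
cluster.initial_master_nodes: ["nodeA1", "nodeA2", "nodeB1"]   # first bootstrap only
discovery.seed_hosts: ["nodeA1.example.com", "nodeA2.example.com", "nodeB1.example.com"]
```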

Thanks David for these explanations, I understand better now.

In my use case, according to your feedback, I must choose whether I favour site A over site B or site B over site A. I will therefore set up 2 master-eligible nodes on the favoured site and only 1 on the other site.


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.