Data loss when the old master node dies and starts up again

Elasticsearch version (bin/elasticsearch --version): elasticsearch/elasticsearch-oss:6.3.2

docker version: 18.03.1-ce

OS version (uname -a if on a Unix-like system): CentOS 7.5

Description of the problem including expected versus actual behavior:

Steps to reproduce:

  1. We have a two-node cluster: A (master) and B (slave).
  2. Node A dies, and B becomes the new master (standalone).
  3. Post new data to B.
  4. A starts up and joins the cluster. B is still the master, but the new data is lost; all data reverts to the old state.

Are you using an external volume to store the Elasticsearch data directory?

Yes, we store the data on an external volume.
The problem is that the new data is lost; it seems the old data was recovered in its place.

This means you have not set discovery.zen.minimum_master_nodes correctly. It must be 2 on each node. As the manual says:

To prevent data loss, it is vital to configure the discovery.zen.minimum_master_nodes setting

When A died, we deleted the config in elasticsearch.yml on node B, so that B would run in standalone mode.

I have already set discovery.zen.minimum_master_nodes: 1 in elasticsearch.yml, but it did not work.

If I set discovery.zen.minimum_master_nodes: 2 and one node dies, the cluster will stop working. I do not want this to happen.

Elasticsearch operates in a clustered mode, not master-slave. If you want a highly available cluster you therefore need a minimum of three master-eligible nodes.
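
To make the arithmetic behind this concrete, here is a minimal Python sketch of the quorum formula given in the Elasticsearch 6.x documentation, (master_eligible_nodes / 2) + 1. The function names are hypothetical helpers written for illustration; they are not part of any Elasticsearch API.

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Quorum size from the documented formula (master_eligible_nodes / 2) + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    # Integer division, as the formula rounds down before adding 1.
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes: int) -> int:
    """Number of master-eligible nodes that can fail while a quorum remains."""
    return master_eligible_nodes - minimum_master_nodes(master_eligible_nodes)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"nodes={n} quorum={minimum_master_nodes(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
```

With two master-eligible nodes the quorum is 2, so losing either node leaves no quorum. With three, the quorum is still 2 and one node can fail.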

While setting minimum_master_nodes to 2 will avoid this, I am curious whether this is expected behavior or a replication edge case.

Quoting steps to repro from issues/39282

  1. I have two nodes in the cluster (A and B).
  2. In the old cluster A is the master: A has node.master: true in elasticsearch.yml, and B has node.master: false.
  3. Stop A and update B's config (remove node.master: false from elasticsearch.yml) so that B can run standalone.
  4. POST new data to node B.
  5. Start up node A with node.master: false in elasticsearch.yml; B has node.master: true in
    elasticsearch.yml, so B is the master of the new cluster.
  6. But the new data is lost!

This does not look like a split brain: at any given time there was only one master. First A is master, then it is stopped and node B is made master (single-node cluster), then A is made master-ineligible and added back to the cluster. But it seems that node A's shards override node B's shards?

  • Could this be an issue around allotting primary terms for shards when master B promoted its replica to primary?

  • How are the cluster state details from node A and node B reconciled when A joins back?

The data loss starts here:

  1. Stop A and update B's config (remove node.master: false from elasticsearch.yml) so that B can run standalone.

Since B was not a master-eligible node it doesn't have a full copy of the cluster metadata on disk, so it starts up empty. It will have some index metadata, but maybe not all of it, and what it has could also be stale. It imports any indices it finds as dangling indices, and blindly trusts the corresponding index metadata even though this could be stale (and that includes primary terms). Re-using a primary term like this breaks all sorts of assumptions on which we rely, so from that point on the behaviour of Elasticsearch is undefined.

If I delete the node.master setting and set minimum_master_nodes to 1 on both A and B, then repeat the same steps, data is also lost.
But if I set minimum_master_nodes to 2 while A and B are both alive, and set it to 1 when only one node is alive, it works.

If you want high availability and want to avoid data loss, you need at least three master-eligible nodes in your cluster. Two is not sufficient.

Because there are only two nodes in our production environment, I have to try every method to avoid data loss and provide a highly available service.

I would recommend trying to add a small dedicated master node somewhere. This does not require a lot of resources and would give you three master-eligible nodes even if it does not hold data.


I want to avoid the situation where one node dies and Elasticsearch can no longer serve requests. This is where the problem lies.

If you want to handle that automatically, without manual intervention or risk of data loss, that requires a minimum of 3 master-eligible nodes. There is no way around this.
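
One way to see why is to test every possible value of minimum_master_nodes against two requirements: the cluster can still elect a master after losing one node, and two disconnected groups of nodes can never both elect a master. The Python sketch below is a simplified model built on those two assumptions, with a hypothetical helper name; it is not how Elasticsearch itself implements elections.

```python
from typing import List


def safe_settings(master_eligible_nodes: int) -> List[int]:
    """Return the minimum_master_nodes values that satisfy both requirements.

    Simplified model (assumption): a group of nodes can elect a master
    if it contains at least minimum_master_nodes master-eligible nodes.
    """
    n = master_eligible_nodes
    safe = []
    for m in range(1, n + 1):
        # Availability: after one node fails, the n - 1 survivors can still elect.
        survives_one_failure = (n - 1) >= m
        # Safety: two disjoint groups cannot both reach m nodes.
        no_split_brain = 2 * m > n
        if survives_one_failure and no_split_brain:
            safe.append(m)
    return safe


if __name__ == "__main__":
    for n in (2, 3):
        print(f"nodes={n} safe minimum_master_nodes values: {safe_settings(n)}")
```

Under this model a two-node cluster has no setting that meets both requirements: a value of 1 allows each node to elect itself, and a value of 2 stops the cluster when either node is down. A three-node cluster has exactly one such value, 2.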