Data loss when the old master node dies and starts up again

jffree · February 22, 2019, 1:43am

Describe the feature:

Elasticsearch version (bin/elasticsearch --version): elasticsearch/elasticsearch-oss:6.3.2

docker version: 18.03.1-ce

OS version (uname -a if on a Unix-like system): centos7.5

Description of the problem including expected versus actual behavior:

Steps to reproduce:

We have a two-node cluster: A (master) and B (slave).
Node A dies, and B becomes the new master (standalone).
Post new data to B.
A starts up and rejoins the cluster. B is still the master, but the new data is lost; all data reverts to the old state.

warkolm · February 22, 2019, 2:29am

Are you using an external volume to store the Elasticsearch data directory in?

jffree · February 22, 2019, 2:34am

Yes, we store the data on an external volume.
The problem is that the new data is lost; the old data seems to have been recovered over it.

DavidTurner · February 22, 2019, 3:04am

This means you have not set discovery.zen.minimum_master_nodes correctly. It must be 2 on each node. As the manual says:

To prevent data loss, it is vital to configure the discovery.zen.minimum_master_nodes setting
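
For reference, the 6.x manual gives the quorum as (master_eligible_nodes / 2) + 1, using integer division. Here is a minimal sketch of that arithmetic, which shows why a cluster with two master-eligible nodes needs the value 2. The helper name `minimum_master_nodes` is only illustrative and is not part of any Elasticsearch API.

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Quorum of master-eligible nodes: (N / 2) + 1 with integer division."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


if __name__ == "__main__":
    # Print the quorum for a few cluster sizes.
    for n in (1, 2, 3, 5):
        print(n, "master-eligible ->", minimum_master_nodes(n))
```

The result is the value to put in discovery.zen.minimum_master_nodes on every node.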

jffree · February 22, 2019, 3:08am

When A died, we deleted the discovery.zen.ping.unicast.hosts setting from elasticsearch.yml on node B, so that B would run in standalone mode.

jffree · February 22, 2019, 5:20am

I have already set discovery.zen.minimum_master_nodes: 1 in elasticsearch.yml, but it did not work.

jffree · February 22, 2019, 5:22am

If I set discovery.zen.minimum_master_nodes: 2 and one node dies, the cluster will stop working. I do not want that to happen.

Christian_Dahlqvist · February 22, 2019, 7:02am

Elasticsearch operates in a clustered mode, not master-slave. If you want a highly available cluster you therefore need a minimum of three master-eligible nodes.

vigyas · February 22, 2019, 10:41am

While setting minimum_master_nodes to 2 will avoid this, I'm curious to know whether this is expected behavior or a replication edge case.

Quoting steps to repro from issues/39282

I have two nodes in cluster (A and B)

old cluster A is master, I set node.master: true in elasticsearch.yml, B is set node.master: false in elasticsearch.yml

Stop A, update B config( remove node.master: false in elasticsearch.yml ), so B can run standalone

POST new data to B node

Startup A node and (set node.master: false in elasticsearch.yml), B is set node.master: true in elasticsearch.yml, so B is master in new cluster

But the new data loss!

This does not look like a split brain; there was only one master at any given time. First A is master, then it is stopped and node B is made master (single-node cluster), then A is made master-ineligible and added back to the cluster. But it seems that node A's shards override node B's shards?

Could this be an issue around allotting primary terms for shards when master B promoted its replica to primary?
How are the cluster state details from node A and node B reconciled when A joins back?

DavidTurner · February 22, 2019, 2:05pm

The data loss starts here:

Stop A, update B config( remove node.master: false in elasticsearch.yml ), so B can run standalone

Since B was not a master-eligible node it doesn't have a full copy of the cluster metadata on disk, so it starts up empty. It will have some index metadata, but maybe not all of it, and what it has could also be stale. It imports any indices it finds as dangling indices, and blindly trusts the corresponding index metadata even though this could be stale (and that includes primary terms). Re-using a primary term like this breaks all sorts of assumptions on which we rely, so from that point on the behaviour of Elasticsearch is undefined.

jffree · February 22, 2019, 2:28pm

If I delete the node.master setting and set minimum_master_nodes to 1 on both A and B, then repeat the same steps, data is also lost.
But if I set minimum_master_nodes to 2 while A and B are both alive, and set it to 1 when only one node is alive, it works.

Christian_Dahlqvist · February 22, 2019, 2:47pm

If you want high availability and want to avoid data loss, you need at least three master-eligible nodes in your cluster. Two is not sufficient.

jffree · February 22, 2019, 3:08pm

We only have two nodes in production, so I have to try every method to avoid data loss and provide a highly available service.

Christian_Dahlqvist · February 22, 2019, 3:19pm

I would recommend trying to add a small dedicated master node somewhere. This does not require a lot of resources and would give you three master-eligible nodes even if it does not hold data.

jffree · February 25, 2019, 2:13am

I want to avoid the situation where one node dies and Elasticsearch can no longer serve requests. This is where the problem lies.

Christian_Dahlqvist · February 25, 2019, 7:08am

If you want to handle that automatically, without manual intervention or risk of data loss, that requires a minimum of 3 master-eligible nodes. There is no way around this.
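
To make the reasoning concrete, here is a rough sketch that checks every possible minimum_master_nodes value for a given number of master-eligible nodes. It is a toy model resting on two assumptions: a setting is "safe" only if two disjoint groups of nodes can never both reach the quorum, and it is "available" only if a master can still be elected after one node is lost. The function names are hypothetical helpers, not Elasticsearch APIs.

```python
from typing import List


def is_safe(nodes: int, min_master: int) -> bool:
    """True if two disjoint groups of nodes can never both reach min_master."""
    return 2 * min_master > nodes


def failures_tolerated(nodes: int, min_master: int) -> int:
    """Master-eligible nodes that can be lost while an election is still possible."""
    return max(nodes - min_master, 0)


def safe_and_available_settings(nodes: int) -> List[int]:
    """Values of min_master that are safe and still survive losing one node."""
    return [
        m
        for m in range(1, nodes + 1)
        if is_safe(nodes, m) and failures_tolerated(nodes, m) >= 1
    ]


if __name__ == "__main__":
    for n in (2, 3):
        print(n, "master-eligible nodes ->", safe_and_available_settings(n))
```

With two nodes the list is empty: a value of 1 allows both nodes to elect themselves independently, and a value of 2 stops the cluster as soon as one node is down. With three nodes, a value of 2 satisfies both conditions.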

system · March 25, 2019, 7:18am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.