A question about primary/replica re-sync implementation

Let's say we have 3 replicas for one index shard and they have the following local op seqs:
node#1(primary) : 1, 2, 3, 4, 5, 6, 7, 8 (local checkpoint: 8, max seqNo: 8, global checkpoint: 5)
node#2(replica#1): 1, 2, 3, 4, 5, 7, 8 (local checkpoint: 5, max seqNo: 8, global checkpoint: 5)
node#3(replica#2): 1, 2, 3, 4, 5, 6, 7 (local checkpoint: 7, max seqNo: 7, global checkpoint: 5)

Suppose node#1 crashes and node#3 (replica#2) is promoted to the new primary. A re-sync then takes place, during which node#3 sends ops 6 and 7 to node#2. As part of the re-sync, node#2 also trims from its translog all ops above node#3's max seqNo (so seq 8 is trimmed).
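To make the trimming step concrete, here is a toy model of it in plain Python. This is not Elasticsearch code; the function name `trim_above` and its shape are illustrative only, assuming the re-sync simply discards translog ops above the new primary's max seqNo:

```python
# Toy model of the re-sync translog trim: the new primary tells each
# replica to discard translog operations above its own max seqNo.
def trim_above(translog_seqnos, new_primary_max_seqno):
    """Keep only ops whose seqNo is <= the new primary's max seqNo."""
    return [op for op in translog_seqnos if op <= new_primary_max_seqno]

node2_translog = [1, 2, 3, 4, 5, 7, 8]   # replica#1 from the example
node3_max_seqno = 7                      # new primary (replica#2) max seqNo

print(trim_above(node2_translog, node3_max_seqno))  # [1, 2, 3, 4, 5, 7]
```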

My question: after the re-sync, node#3 has only the changes for seq 1 through 7 in its local Lucene index, but node#2 additionally has the change for seq 8, since translog trimming does not roll back changes already applied to Lucene. So there could be a data inconsistency between the Lucene copies of node#3 and node#2 after the re-sync (even though their translogs are consistent).

Have I missed anything here?

@iamorchid When a replica detects the new primary, it will roll back its Lucene index, then recover locally up to the global checkpoint. In your example, operation #8 won't exist in node#2's copy after the primary-replica resync.
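A toy sketch of this answer, applied to the numbers in the question (again plain Python, not Elasticsearch code; `resync_replica` is a made-up name, and the model collapses rollback-to-safe-commit plus local translog replay into a single "drop everything above the global checkpoint" step):

```python
# Toy sketch of what happens on node#2 when it detects the new primary:
# roll the Lucene copy back to the global checkpoint, then receive the
# remaining ops (6 and 7) from the new primary during re-sync.
def resync_replica(lucene_ops, global_ckpt, resync_ops):
    # 1. Rollback + local recovery: keep only ops up to the global
    #    checkpoint (a simplification of rollback-then-replay).
    lucene = [op for op in lucene_ops if op <= global_ckpt]
    # 2. Re-sync: apply ops shipped by the new primary.
    for op in resync_ops:
        if op not in lucene:
            lucene.append(op)
    return sorted(lucene)

node2_lucene = [1, 2, 3, 4, 5, 7, 8]  # node#2's Lucene before rollback
print(resync_replica(node2_lucene, 5, [6, 7]))
# -> [1, 2, 3, 4, 5, 6, 7]: matches node#3, and op 8 is gone
```

The key point the sketch illustrates is that the rollback removes op 8 from Lucene, not just from the translog, so the two copies converge.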


Thanks for your reply. Does this logic also exist in 6.4.2? I'm currently looking at the 6.4.2 implementation and haven't noticed it so far. I'm not sure whether it was added in a newer version.

Hi @iamorchid,

It was implemented in 6.5.0 (see https://github.com/elastic/elasticsearch/pull/33473).